Re: More ideas for speeding up sorting

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: More ideas for speeding up sorting
Date: 2016-09-09 20:54:06
Message-ID: CAM3SWZTXfOx48Ub7Ytp8Cb4q3KSy+oGXRfUtvtf5dko1qRTG_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 9, 2016 at 6:14 AM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> A few things caught my eye while hacking on the latest round of sorting
> patches.

This is how it begins. :-)

> Starting a new thread because these are orthogonal to the things
> discussed on the other threads:
>
> 1. SortTuple.tupindex is not used when the sort fits in memory. If we also
> got rid of the isnull1 flag, we could shrink SortTuple from 24 bytes to 16
> bytes (on 64-bit systems). That would allow you to pack more tuples into
> memory, which generally speeds things up. We could for example steal the
> least-significant bit from the 'tuple' pointer for that, because the
> pointers are always at least 4-byte aligned. (But see next point)

I've wanted to do that for a long time. I'd rather make SortTuples 16
bytes and be done with it, rather than introduce variations, mostly
because I now think it's possible without any real downside. Maybe
your preread patch gets us to a place where the tupindex field is
useless, because there is no chaining of SortTuples during merging, no
free list of the "slot" SortTuples, and so on. Maybe we could kill
replacement selection fully, so it no longer needs tupindex, leaving
us with no need for it entirely (not just when everything still fits
in memory). But even if we don't kill replacement selection entirely,
we still only need one bit these days, since in practice there are
only two distinct run numbers ever represented in tupindex these days
(RUN_FIRST, and HEAP_RUN_NEXT). We can spare a second bit from the
"tuple" pointer, I suppose.

I know that in practice, even virtual address space is significantly
less than 64-bits on x86_64, so I think that there should be several
bits there for the taking, even if the addresses are not aligned
(although, they are). However, I have no idea if there is any
precedent for this general idea at all, and would feel better about it
if there was. I guess we don't have to make it any more complicated
than your argument about alignment.

> 2. We could easily categorize incoming tuples as the come in, into those
> that have a leading NULL, and others. If we kept them in separate arrays, or
> perhaps grow memtuples from bottom-up for non-NULLs and from top-down for
> NULLs, we wouldn't need to store, or compare, the 'isnull1' flag at all, in
> the quicksort.'

We also wouldn't need to use datum1 to store nothing, at least in the
"state.tuples" (non-datum-pass-by-value) case. Where the leading
attribute is NULL, we can "discard" it, and sort those tuples
separately, and return them to caller first or last as required.

> 3. In the single-Datum case, i.e. tuplesort_begin_datum(), we could just
> keep a counter of how many NULLs we saw, and not store them at all. That's
> quite a special case, but it's simple enough that it might be worth it.

A lot hinges on the performance of this special case. For example,
CREATE INDEX CONCURRENTLY is quite sensitive to it in practice, since
the TID sort is (even in 9.6) the major reason for it being slower
than plain CREATE INDEX (I think that the second pass over the heap is
less important most of the time -- you've seen how sorting isn't I/O
bound much of the time for yourself, so you can imagine). Even still,
I'd be inclined to try to get SortTuples down to 16 bytes, at which
point this special case hardly needs to be considered (we just skip
the quicksort iff it's a single attribute sort, which includes
MinimalTuple cases where there happens to only be one attribute that
we sort on).

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2016-09-09 21:01:14 Re: More ideas for speeding up sorting
Previous Message Peter Eisentraut 2016-09-09 20:34:08 Re: Logical Replication WIP