Re: Considering additional sort specialisation functions for PG16

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Considering additional sort specialisation functions for PG16
Date: 2023-01-27 06:56:29
Message-ID: CAFBsxsGLGOEmA8zQS+rQqu+1s57OoDHqtMkoUPdju5DOOK4CbQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 26, 2023 at 7:15 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Thu, 26 Jan 2023 at 23:29, John Naylor <john(dot)naylor(at)enterprisedb(dot)com> wrote:
> > Coming back to this, I wanted to sketch out this idea in a bit more detail.
> >
> > Have two memtuple arrays, one for first sortkey null and one for first sortkey non-null:
> > - Qsort the non-null array, including whatever specialization is available. Existing specialized comparators could ignore nulls (and their ordering), taking less space in the binary.
> > - Only if there is more than one sort key, qsort the null array. Ideally at some point we would have a method of ignoring the first sortkey (this is an existing opportunity that applies elsewhere as well).
> > - To handle two arrays, grow_memtuples() would need some adjustment, as would any callers that read the final result of an in-memory sort -- they would need to retrieve the tuples starting with the appropriate array depending on NULLS FIRST/LAST behavior.
>
> Thanks for coming back to this. I've been thinking about this again
> recently due to what was discovered in [1].

That was indeed part of the motivation for bringing this up.

> Why I'm bringing this up here is that I wondered, in addition to what
> you're mentioning above, if we're making some changes to allow
> multiple in-memory arrays, would it be worth going a little further
> and allowing it so we could have N arrays which we'd try to keep under
> L3 cache size. Maybe some new GUC could be used to know what a good
> value is. With fast enough disks, it's often faster to use smaller
> values of work_mem which don't exceed L3 to keep these batches
> smaller. Keeping them in memory would remove the need to write out
> tapes.

That's interesting. I don't know enough to guess how complex it would be to
make "external" merges agnostic about whether the tapes are on disk or in
memory.

If in-memory sorts were designed analogously to external ones,
grow_memtuples would never have to repalloc; it could just allocate a new
tape, which could shave some further cycles.

My hunch in my last email was that having separate groups of tapes for each
null/non-null first sortkey would be complex, because it increases the
number of places that have to know about nulls and their ordering. If we
wanted to go to N arrays instead of 2, and additionally wanted separate
null/non-null treatment, it seems we would still need 2 sets of arrays, one
with N non-null-first-sortkey tapes and one with M null-first-sortkey tapes.

So using 2 arrays seems like the logical first step. I'd be curious to hear
other possible development paths.

> I think the slower sorts I found in [2] could also be partially caused
> by the current sort specialisation comparators re-comparing the
> leading column during a tie-break. I've not gotten around to disabling
> the sort specialisations to see if and how much this is a factor for
> that test.

Right, that's worth addressing independently of the window function
consideration. I'm still swapping this area back into my head, but I believe
one issue is that state->base.onlyKey signals two things at once: "one
sortkey" and "not abbreviated". We could add a separate branch for "first
key unabbreviated, nkeys > 1" -- I don't think we'd need to specialize, just
branch -- and instead of state->base.comparetup, call a set of analogous
functions that only handle keys 2 and above (comparetup_tail_* ? or possibly
just add a boolean parameter compare_first). That wouldn't pose a huge
challenge, I think, since they're already written like this:

	/* Compare the leading sort key */
	compare = ApplySortComparator(...);
	if (compare != 0)
		return compare;

	/* Compare additional sort keys */
	...

The null/non-null separation would eliminate a bunch of branches in inlined
comparators, so we could afford to add another branch for number of keys.

I haven't thought through either of these ideas in the gory detail, but I
don't yet see any big obstacles.

--
John Naylor
EDB: http://www.enterprisedb.com
