Quick Links

Re: Considering additional sort specialisation functions for PG16

From:	John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To:	David Rowley <dgrowleyml(at)gmail(dot)com>
Cc:	PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject:	Re: Considering additional sort specialisation functions for PG16
Date:	2023-01-26 10:29:25
Message-ID:	CAFBsxsH=3UA+8Sp2sgUs=piYOmR2CQ-Rq5sUhCX-M3Gqv4TRog@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Aug 23, 2022 at 1:13 PM John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
wrote:
>
> On Tue, Aug 23, 2022 at 11:24 AM David Rowley <dgrowleyml(at)gmail(dot)com>
wrote:
> >
> > On Tue, 23 Aug 2022 at 15:22, John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
wrote:
> > > Did you happen to see
> > >
> > >
https://www.postgresql.org/message-id/CAFBsxsFhq8VUSkUL5YO17cFXbCPwtbbxBu%2Bd9MFrrsssfDXm3Q%40mail.gmail.com
> >
> > I missed that. It looks like a much more promising idea than what I
> > came up with. I've not looked at your code yet, but I'm interested and
> > will aim to look soon.
>
> Note that I haven't actually implemented this idea yet, just tried to
> model the effects by lobotomizing the current comparators. I think
> it's worth pursuing and will try to come back to it this cycle, but if
> you or anyone else wants to try, that's fine of course.

Coming back to this, I wanted to sketch out this idea in a bit more detail.

Have two memtuple arrays, one for first sortkey null and one for first
sortkey non-null:
- Qsort the non-null array, including whatever specialization is available.
Existing specialized comparators could ignore nulls (and their ordering)
taking less space in the binary.
- Only if there is more than one sort key, qsort the null array. Ideally at
some point we would have a method of ignoring the first sortkey (this is an
existing opportunity that applies elsewhere as well).
- To handle two arrays, grow_memtuples() would need some adjustment, as
would any callers that read the final result of an in-memory sort -- they
would need to retrieve the tuples starting with the appropriate array
depending on NULLS FIRST/LAST behavior.

I believe external merges wouldn't have to do anything different, since
when writing out the tapes, we read from the arrays in the right order.

(One could extend this idea further and have two pools of tapes for null
and non-null first sortkey, that are merged separately, in the right order.
That sounds like quite a bit more complexity than is worth, however.)

--
John Naylor
EDB: http://www.enterprisedb.com

In response to

Re: Considering additional sort specialisation functions for PG16 at 2022-08-23 06:13:13 from John Naylor

Responses

Re: Considering additional sort specialisation functions for PG16 at 2023-01-26 11:13:24 from Pavel Borisov
Re: Considering additional sort specialisation functions for PG16 at 2023-01-26 12:14:57 from David Rowley
Re: Considering additional sort specialisation functions for PG16 at 2023-02-16 11:46:50 from John Naylor

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andrei Zubkov	2023-01-26 11:02:43	Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements
Previous Message	Dean Rasheed	2023-01-26 10:27:58	Re: to_hex() for negative inputs