Re: tuple radix sort

From: John Naylor <johncnaylorls(at)gmail(dot)com>
To: David Geier <geidav(dot)pg(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: tuple radix sort
Date: 2025-11-13 04:01:18
Message-ID: CANWCAZbwX5wb+pCU8hTQYonDAqBhQt5ZH7EP8cQL7O107fYoKg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 12, 2025 at 9:28 PM David Geier <geidav(dot)pg(at)gmail(dot)com> wrote:
> I've also been looking into radix sort the last days to accelerate GIN
> index builds. Ordering and removing duplicates requires a fast sort in
> generate_trgm().

If that's the case then I suggest first seeing if dfd8e6c73ee made
things any worse. A simpler possible improvement is to use a similar
normalization step for the chars, if needed, then do the sort and
quinique with a specialization for unsigned chars. (We don't yet
specialize qunique, but that can be remedied). If you're interested,
please start a separate thread for that.

> What would be great is if we could build a generic radix sort
> implementation, similarly to sort_template.h that can be used in other
> places. We would have to think a bit about the interface because instead
> of a comparator we would require some radix extraction callback.

That's moving the goalposts too far IMO. I want to get to a place
where I feel comfortable with the decisions made, and that already
requires a lot of testing. Also, I don't want to risk introducing
abstractions that make future improvements to tuplesort more
cumbersome.

--
John Naylor
Amazon Web Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Chao Li 2025-11-13 04:02:42 Re: Suggestion to add --continue-client-on-abort option to pgbench
Previous Message Chao Li 2025-11-13 03:56:22 Re: SQL:2011 Application Time Update & Delete