Re: tuple radix sort

From: David Geier <geidav(dot)pg(at)gmail(dot)com>
To: John Naylor <johncnaylorls(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: tuple radix sort
Date: 2025-11-12 14:28:37
Message-ID: a2ca34fd-7693-4406-9250-2f3435353001@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 29.10.2025 07:28, John Naylor wrote:
>
> Next steps: Try to find regressions (help welcome here). The v1 patch
> has some optimizations, but in other ways things are simple and/or
> wasteful. Exactly how things fit together will be informed by what, if
> anything, has to be done to avoid regressions. I suspect the challenge
> will be multikey sorts when the first key has low cardinality. This is
> because tiebreaks are necessarily postponed rather than taken care of
> up front. I'm optimistic, since low cardinality cases can be even
> faster than our B&M qsort, so we have some headroom:

Hi John,

I've also been looking into radix sort the last days to accelerate GIN
index builds. Ordering and removing duplicates requires a fast sort in
generate_trgm(). My own implementation (likely slower than the
algorithms you used) also showed a decent speedup.

Beyond that there are many more places in the code base that could be
changed to use radix sort instead of qsort.

What would be great is if we could build a generic radix sort
implementation, similarly to sort_template.h that can be used in other
places. We would have to think a bit about the interface because instead
of a comparator we would require some radix extraction callback.

If you're open to that idea I could give abstracting the code a try.

--
David Geier

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2025-11-12 14:35:58 Re: [PATCH] Add hints for invalid binary encoding names in encode/decode functions
Previous Message Chao Li 2025-11-12 14:27:22 Re: alignas (C11)