Re: Reduce build times of pg_trgm GIN indexes

From: David Geier <geidav(dot)pg(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reduce build times of pg_trgm GIN indexes
Date: 2026-01-14 12:27:05
Message-ID: 8a311d7b-9101-4947-a700-6a7821678462@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Heikki!

On 09.01.2026 22:02, David Geier wrote:
> Given that doing the sort on pre-sorted input apparently doesn't add
> measurable overhead, according to my benchmark results, we can apply
> your patch and leave mine out for the moment being.

I've removed my patch from the patch set in favor of your patch. I've
adapted your patch slightly as follows:

- I replaced the custom for-loop by qunique()
- I switched to sort_template.h which gives a nice speedup as it can do
some things more efficiently now where the entries are simple Datums.
- I use palloc0_array() to initialize the array to GIN_CAT_NORM_KEY in
one go.

Your patch together with my changes gives a 20% speedup on a table with
arrays of 1000 elements and 10% NULLs. See attached test script.

> That's btw. also the reason for why 0002 doesn't show much gain: when
> the data is pre-sorted, cmpEntries() is not called as much.

This turned out to be not the case. I tested 0002 with the attached
script but that neither showed any significant improvements. It's still
curious to me because cmpEntries() is called hundreds of millions of
times and the disassembly shows that the optimized function indeed
directly calls the operator rather than having the indirection via
FunctionCall2Coll().

Anyways, I've removed the patch from the patch set for the moment being.

>>> I would prefer to change qunique() instead. That would enforce using an
>>> adequate comparison function from the get go. There are only ~15 calls
>>> to qunique(). So refactoring this should also be a fairly small patch. I
>>> can do that if there's agreement for that approach.
>>
>> Works for me.
>>
>> At quick glance, most if not all of the qunique() calls call qsort()
>> just before qunique(). I wonder if we should have a single "sort and
>> deduplicate" function, instead. It could perhaps do some deduplication
>> "on the go", or other optimizations.
>
> If it's just for deduplication purposes and the data doesn't have to end
> up sorted, something based on a hash map should be even faster. How
> about we start with changing the qunique() comparator signature and as a
> 2nd step take a closer look at how to go about providing a function that
> does it in one go?

Thinking about this some more: ideally we have two functions: something
like deduplicateArray() and sortAndDeduplicateArray(). We could
initially implement deduplicateArray() on top of
sortAndDeduplicateArray(). If we ever find a case that needs
optimization, and doesn't require the data to actually end up sorted, we
can implement deduplicateArray() e.g. on top of simplehash.h.

I'll draft a patch and submit it in a separate thread.

> What about the other patches? 0003 and 0007 are also pretty simple and
> IMHO uncontroversial while giving decent savings.

I've reordered the patches such that the ones that I think are
uncontroversial, small and ready to be committed are at the beginning
(patches 0001 - 0004). The radix sort and ASCII fast-path patches come
last (0005 and 0006). I would like to first concentrate on getting 0001
- 0004 in and then get back to 0005 and 0006.

I remeasured the savings of 0001 - 0004, which comes on top of the
already committed patch that inlined the comparison function, which gave
another ~5%:

Data set | Patched (ms) | Master (ms) | Speedup
--------------------|--------------|--------------|----------
movies(plot) | 8,058 | 10,311 | 1.27x
lineitem(l_comment) | 223,233 | 256,986 | 1.19x

--
David Geier

Attachment Content-Type Size
table_with_random_int_arrays.sql application/sql 633 bytes
v3-0006-Add-ASCII-fastpath-to-generate_trgm_only.patch text/x-patch 4.0 KB
v3-0005-Optimize-generate_trgm-with-radix-sort.patch text/x-patch 2.9 KB
v3-0004-Faster-qunique-comparator-in-generate_trgm.patch text/x-patch 2.0 KB
v3-0003-Make-btint4cmp-branchless.patch text/x-patch 1.0 KB
v3-0002-Optimize-generate_trgm-with-sort_template.h.patch text/x-patch 2.6 KB
v3-0001-Optimize-sort-and-deduplication-in-ginExtractEntr.patch text/x-patch 7.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Japin Li 2026-01-14 12:40:13 Re: Add IS_INDEX macro to brin and gist index
Previous Message Japin Li 2026-01-14 12:17:18 Re: GIN pageinspect support for entry tree and posting tree