Re: Parallel CREATE INDEX for GIN indexes

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Kirill Reshke <reshkekirill(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel CREATE INDEX for GIN indexes
Date: 2026-01-16 19:07:10
Message-ID: 0dc810dd-6cf1-4789-abd1-46f6332fe778@vondra.me
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1/16/26 19:43, Kirill Reshke wrote:
> On Fri, 16 Jan 2026 at 23:03, Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>>
>>
>> On 1/15/26 06:08, Kirill Reshke wrote:
>>> ...
>>>>>
>>>>> I think tuplesort_begin_index_gin() has the same issue. It does this to
>>>>> look up the comparison function:
>>>>>
>>>>> /*
>>>>> * Look for an ordering for the index key data type, and then the sort
>>>>> * support function.
>>>>> */
>>>>> typentry = lookup_type_cache(att->atttypid, TYPECACHE_LT_OPR);
>>>>> PrepareSortSupportFromOrderingOp(typentry->lt_opr, sortKey);
>>>>>
>>>>> That also looks up the key type's b-tree operator rather than the GIN
>>>>> opclass's compare function.
>>>>>
>>>>
>>>> Thanks for noticing this, I'll get this fixed next week.
>>>>
>>>> Funny, you noticed that almost exactly one year after I fixed the other
>>>> incorrect place in the patch ;-)
>>>>
>>
>> The attached 0001 should fix this. I'm wondering what kinds of issues it
>> might cause, and if we need to mention that in release notes. AFAICS it
>> would cause trouble if (a) there's no b-tree opclass, in which case the
>> tuplesort_begin_index_gin() errors out, or (b) the btree/gin opclasses
>> disagree on ordering (or rather equality).
>>
>> AFAIK we haven't heard anything about index builds failing on 18, and
>> with both btree/gin opclasses it seems unlikely they'd define equality
>> differently. Maybe I'm missing something.
>>
>>>
>>>
>>> I was looking at code coverage for GIN indexes [1] and noticed that
>>> Parallel GIN build is not covered in the regression test. Btree
>>> parallel build (_bt_begin_parallel function for example at[0]) is
>>> covered. I did not find any btree-parallel-build dedicated test, looks
>>> like this is covered by accident, from write_paralle, partition_prune
>>> and other regression tests.
>>>
>>> So, maybe add some tests here? Also, I wonder what regression sql file
>>> to use, or maybe create a new one.
>>>
>>
>> Fair point. Someone pinged me about this coverage issue a while back,
>> but it completely slipped my mind. 0002 tweaks two places in regression
>> tests to build the GIN index in parallel. Which for me seems to improve
>> the coverage quite a bit. It's not perfect, because it's still not a lot
>> of data, which means some code is impossible to hit (e.g. it'll not hit
>> trimming).
>>
>> regards
>>
>> --
>> Tomas Vondra
>
>
> Hi! Thank you for posting this.
>
> I have looked at 0001. Added code seems to be 1-1 matching for what
> GinBufferInit is doing, but the commit message refers to initGinState.
> Maybe I did not get it right though...

Well, GinBufferInit got it from initGinState too. I haven't checked
which of those places is "closer", I simply used the original one.

>
> Also, the fact that tuplesort_begin_index_gin is placed not inside
> src/backend/access/gin ... Is it a little awkward? I am not saying
> this is anything worth fixing, but functions like writetup_index_gin
> etc, not being inside `access/gin`, is it a layering violation?
>

There's also tuplesort_begin_index_btree, tuplesort_begin_index_hash,
tuplesort_begin_index_gist and tuplesort_begin_index_brin. I don't think
this is a layering violation, it's simply "create a tuplesort tailored
for a specific index AM". Or from the other direction - we could move
this to individual index AMs, but then the index AMs would need to know
about tuplesort. Which seems strange.

regards

--
Tomas Vondra

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Jones 2026-01-16 19:32:14 Re: [PoC] XMLCast (SQL/XML X025)
Previous Message Kirill Reshke 2026-01-16 18:43:47 Re: Parallel CREATE INDEX for GIN indexes