Re: [PATCH] Covering SPGiST index

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Covering SPGiST index
Date: 2020-11-17 17:19:52
Message-ID: CALT9ZEG1FOPVM0ewfgw13VgQ7RwjxrpoEbY9psRBexQgPH4q5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

вт, 17 нояб. 2020 г. в 11:36, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>:

> I've started to review this, and I've got to say that I really disagree
>> with this choice:
>>
>> + * If there are INCLUDE columns, they are stored after a key value, each
>> + * starting from its own typalign boundary. Unlike IndexTuple, first
>> INCLUDE
>> + * value does not need to start from MAXALIGN boundary, so SPGiST uses
>> private
>> + * routines to access them.
>>
> Tom, I took a stab at making the code for tuple creation/decomposition
more optimal. Now I see several options for this:
1. Included values can be added after key value as a whole index tuple. Pro
of this: it reuses existing code perfectly. Con is that it will introduce
extra (empty) index tuple header.
2. Existing state: pro is that in my opinion, it has the least possible
gaps. The con is the need to duplicate much of the existing code with some
modification. Frankly I don't like this duplication very much even if it is
only a private spgist code.
2A. Existing state can be shifted into fewer changes in index_form_tuple
and index_deform_tuple if I shift the null mask after the tuple header and
before the key value (SpGistTupleHeader+nullmask chunk will be maxaligned).
This is what I proposed in the previous answer. I tried to work on this
variant but it will need to duplicate index_form_tuple and
index_deform_tuple code into private version. The reason is that spgist
tuple has its own header of different size and in my understanding, it can
not be incorporated using index_form_tuple.
3. I can split index_form_tuple into two parts: a) header adding and size
calculation, b) filling attributes. External (a), which could be
constructed differently for SpGist, and internal (b) being universal.
3A. I can make index_form_tuple accept pointer as an argument to create
tuple in already palloced memory area (with the shift to its start). So
external caller will be able to incorporate headers after calling
index_form_tuple routine.

Maybe there is some other way I don't imagine yet. Which way do you think
for me better to follow? What is your advice?

> I also find it unacceptable that you stuck a tuple descriptor pointer into
>> the rd_amcache structure. The relcache only supports that being a flat
>> blob of memory. I see that you tried to hack around that by having
>> spgGetCache reconstruct a new tupdesc every time through, but (a) that's
>> actually worse than having no cache at all, and (b) spgGetCache doesn't
>> really know much about the longevity of the memory context it's called in.
>> This could easily lead to dangling tuple pointers, serious memory bloat
>> from repeated tupdesc construction, or quite possibly both. Safer would
>> be to build the tupdesc during initSpGistState(), or maybe just make it
>> on-demand. In view of the previous point, I'm also wondering if there's
>> any way to get the relcache's regular rd_att tupdesc to be useful here,
>> so we don't have to build one during scans at all.
>>
> I see that FormData_pg_attribute's inside TupleDescData are situated in a
single memory chunk. If I add it at the ending of allocated SpGistCache as
a copy of this chunk (using memcpy), not a pointer to it as it is now, will
it be safe for use?
Or maybe it would still bel better to initialize tuple descriptor any
time initSpGistState called without trying to cache it?

What will you advise?

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com <http://www.postgrespro.com>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2020-11-17 17:47:02 Re: Deleting older versions in unique indexes to avoid page splits
Previous Message Victor Yegorov 2020-11-17 17:19:46 Re: Deleting older versions in unique indexes to avoid page splits