Re: SP-GiST confusion: indexed column's type vs. index column type

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
Subject: Re: SP-GiST confusion: indexed column's type vs. index column type
Date: 2021-04-03 02:05:02
Message-ID: 3935848.1617415502@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> Also, both the code and docs thought that the "reconstructedValue"
> datums that are passed down the tree during a search should be of
> the leaf data type. This seems to me to be arrant nonsense.
> As an example, if you made an opclass that indexes 1-D arrays
> by labeling each inner node with successive array elements,
> right down to the leaf which is the last array element, it will
> absolutely not work for the reconstructedValues to be of the
> leaf type --- they have to be of the array type. (As I said
> in commit 1ebdec8c0, this'd be a fairly poorly-chosen opclass
> design, but it seems like it ought to physically work.)

So after trying to build an opclass that does that, I have a clearer
understanding of why opclasses that'd break the existing code are
so thin on the ground. You can't do the above, because the opclass
cannot force the AM to add inner nodes that it doesn't want to.
For example, the first few index entries will simply be dumped into
the root page as undifferentiated leaf tuples. This means that,
if you'd like to be able to return reconstructed index entries, the
leaf data type *must* be able to hold all the data that is in an
input value. In principle you could represent it in some other
format, but the path of least resistance is definitely to make the
leaf type the same as the input.
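
To make the constraint above concrete, here is a toy Python model. It is
only a sketch and not PostgreSQL's SP-GiST code: the class ToyIndex, the
constant LEAF_CAPACITY, and both conversion functions are hypothetical
names invented for illustration. It assumes, as described above, that
the first few entries land in the root page as plain leaf tuples with no
inner node above them.

```python
# Toy model only; every name here is hypothetical, not a PostgreSQL API.
LEAF_CAPACITY = 4  # pretend the root page holds this many leaf tuples


class ToyIndex:
    def __init__(self, to_leaf):
        self.to_leaf = to_leaf   # opclass-side conversion: input -> leaf datum
        self.root_leaves = []    # the root starts out as a plain leaf page

    def insert(self, value):
        if len(self.root_leaves) >= LEAF_CAPACITY:
            raise NotImplementedError("page splits are not modeled here")
        # The AM, not the opclass, decides that no inner node is needed yet,
        # so there is no prefix or node label to carry part of the value.
        self.root_leaves.append(self.to_leaf(value))

    def scan_root(self):
        # Nothing was accumulated on the way down, so the leaf datum is the
        # only source for a reconstructed index entry.
        return list(self.root_leaves)


def last_element_only(arr):
    """Leaf stores just the final array element (the design that fails)."""
    return arr[-1]


def whole_array(arr):
    """Leaf stores everything that was in the input value."""
    return tuple(arr)


lossy = ToyIndex(last_element_only)
full = ToyIndex(whole_array)
for arr in ([1, 2, 3], [4, 5, 3]):
    lossy.insert(arr)
    full.insert(arr)

print(lossy.scan_root())  # two distinct inputs are now indistinguishable
print(full.scan_root())   # both inputs can be returned intact
```

In this model the lossy variant cannot return the original arrays from
the root page, which is the situation described above.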

I still want to make an opclass in which those types are different,
if only for testing purposes, but I'm having a hard time coming up
with a plan that's not totally lame. Best idea I can think of is
to wrap the input in a bytea, which just begs the question "why
would you do that?". Anybody have a less lame thought?
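
As a rough illustration of the bytea-wrapping idea, the sketch below
uses a different leaf representation that still holds everything in the
input. It is Python, not opclass code, and the function names to_leaf
and to_input are hypothetical; it also assumes a text input, which is
not specified above.

```python
# Hypothetical conversion pair; the leaf type (bytes) differs from the
# input type (str) but loses no information.
def to_leaf(value: str) -> bytes:
    """Wrap the input in an opaque byte string for storage in the leaf."""
    return value.encode("utf-8")


def to_input(leaf: bytes) -> str:
    """Recover the original input value from the leaf datum alone."""
    return leaf.decode("utf-8")


original = "spgist"
stored = to_leaf(original)
assert isinstance(stored, bytes)
assert to_input(stored) == original  # round trip is lossless
```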

regards, tom lane
