Re: Correct docs about GiST leaf page structure

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>
Cc: pgsql-docs(at)lists(dot)postgresql(dot)org
Subject: Re: Correct docs about GiST leaf page structure
Date: 2026-03-30 22:06:01
Message-ID: 2581763.1774908361@sss.pgh.pa.us
Lists: pgsql-docs

Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com> writes:
> On Sun, Mar 29, 2026 at 12:32 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Actually I think it's more complicated than that. A GiST opclass
>> can choose whether to compress leaf-key entries, and if it does it
>> can use a different representation than it does on internal pages.
>> You can see that in action in compress/decompress functions that
>> pay attention to the GISTENTRY.leafkey flag, which many do.

> I think your changes are great. I agree about not needing two commits.
> My only hesitation is removing the line about STORAGE. In btree_gist
> we do declare the storage of many opclasses. But I'm not sure why. Is
> it necessary? Does an opclass gain some advantage from it? Does core
> use that information somehow? Especially if leaf keys might or might
> not be the same type as internal keys, I'm not sure what value
> declaring STORAGE can provide. (It must be for core's sake, not the
> opclass's, right?)

Excellent questions, and thanks for holding my feet to the fire about
that ;-). Reading more closely, the STORAGE option does indeed do
something: it determines the declared data type of the index's column,
as stored in pg_attribute. And that's important because GiST uses
that datatype while forming or deforming index tuples. So it has to
be accurate --- but only to the extent of having the right
typlen/typbyval/typalign properties, because that's as much as
index_form_tuple() and related functions really care about. They
don't look into the contents of the entries, except for the length
word if it's typlen -1.
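
As a rough illustration of that point, here is a minimal Python sketch of a tuple-sizing routine that consults only typlen/typbyval/typalign and, for typlen -1, the length word. It is a toy model, not the code of index_form_tuple(): the names StorageProps, datum_size, tuple_data_size and make_varlena are hypothetical, and the 4-byte little-endian length word that counts itself is a simplifying assumption rather than PostgreSQL's actual varlena header format.

```python
import struct
from dataclasses import dataclass

# typalign codes as they appear in pg_type: char, short, int, double.
ALIGN_BYTES = {"c": 1, "s": 2, "i": 4, "d": 8}


@dataclass(frozen=True)
class StorageProps:
    """Hypothetical stand-in for the three properties the STORAGE type supplies."""
    typlen: int      # > 0: fixed width in bytes; -1: prefixed by a length word
    typbyval: bool   # carried for completeness; unused by the sizing logic here
    typalign: str    # one of the keys of ALIGN_BYTES


def make_varlena(payload: bytes) -> bytes:
    """Build a toy variable-length datum: 4-byte length word (self-inclusive) + payload."""
    return struct.pack("<I", 4 + len(payload)) + payload


def datum_size(props: StorageProps, raw: bytes) -> int:
    """Size of one entry, decided from the declared properties alone."""
    if props.typlen > 0:
        return props.typlen
    if props.typlen == -1:
        # The only peek into the entry: its leading length word.
        (total,) = struct.unpack_from("<I", raw, 0)
        return total
    raise ValueError("typlen value not covered by this sketch")


def tuple_data_size(columns) -> int:
    """Total data-area size for (props, raw) pairs, with alignment padding."""
    offset = 0
    for props, raw in columns:
        align = ALIGN_BYTES[props.typalign]
        offset = (offset + align - 1) // align * align
        offset += datum_size(props, raw)
    return offset


VARLENA = StorageProps(typlen=-1, typbyval=False, typalign="i")
INT4 = StorageProps(typlen=4, typbyval=True, typalign="i")
CHAR1 = StorageProps(typlen=1, typbyval=True, typalign="c")

# Two different on-disk representations that share the same declared properties.
leaf_key = make_varlena(b"leaf-format")
internal_key = make_varlena(b"\x01\x02 internal bounding key")
```

In this model both leaf_key and internal_key are sized correctly under VARLENA even though their payloads differ, whereas declaring INT4 for the same bytes would make the routine assume 4 bytes and mis-size the entry.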

My claim that the leaf key representation can be different from upper
levels is still accurate, but both representations have to match the
typlen/typbyval/typalign properties of whatever type is mentioned
in STORAGE. (I was misled by the fact that the GiST code has
different "leafTupdesc" and "nonLeafTupdesc" tuple descriptors.
But leafTupdesc is just the standard rd_att descriptor made from
the index's pg_attribute entries, and nonLeafTupdesc differs from
it only in having removed any INCLUDE attributes.)
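
The relationship between the two descriptors can be sketched in a few lines of Python. This is only a model under stated assumptions: Attr and non_leaf_descriptor are hypothetical names, the column names are invented, and the real descriptors are C structures built by the GiST code rather than Python lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attr:
    """Hypothetical model of one index column as described by pg_attribute."""
    name: str
    typlen: int
    typbyval: bool
    typalign: str


def non_leaf_descriptor(leaf_desc, n_key_attrs):
    """Keep only the key columns; INCLUDE columns come after them and are dropped."""
    return leaf_desc[:n_key_attrs]


# Leaf descriptor: one key column (typed per STORAGE) plus one INCLUDE column.
leaf_desc = [
    Attr("range_key", typlen=-1, typbyval=False, typalign="i"),
    Attr("note_id", typlen=4, typbyval=True, typalign="i"),
]
non_leaf_desc = non_leaf_descriptor(leaf_desc, n_key_attrs=1)
```

The key column's entry is the same object in both lists, which mirrors the point that leaf and non-leaf keys are formed against the same declared type.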

So here's a v3 that accounts for that. I also decided that we were
going in quite the wrong direction by cramming more info into the
summary paragraph early in gist.sgml. The general plan there is to
offer about a one-sentence description of each opclass method, and
then go into more detail as necessary in the per-method text below.
So I moved all this info down into the compress method's section.
This seems to me to read noticeably better.

regards, tom lane

Attachment: v3-0001-Correct-GiST-documentation-about-compressed-value.patch (text/x-diff, 4.1 KB)
