Re: [PATCH] Covering SPGiST index

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Covering SPGiST index
Date: 2020-08-24 13:19:37
Message-ID: CALT9ZEFScqVk7Q6FeKhTGFgHPT2iQnhTB6Mz-Uitry1xTEJG6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> I'm looking into the patch. I have few notes:
>
> 1. I see that in src/backend/access/spgist/README you describe SP-GiST
> tuple as sequence of {Value, ItemPtr to heap, Included attributes}. Is it
> different from regular IndexTuple where tid is within TupleHeader?
>

Yes, the header of SpGist tuple is put down in a little bit different way
than index tuple. It is also intended to connect spgist leaf tuples in
chains on a leaf page so it already have more complex layout and bigger
size that index tuple header.

SpGist tuple header size is 12 bytes which is a maxaligned value for 32 bit
architectures, and key value can start just after it without any gap. This
is of value, as unnecessary index size increase slows down performance and
is evil anyway. The only part of this which is left now is a gap
between SpGist tuple header and first value on 64 bit architecture (as
maxalign value in this case is 16 bytes and 4 bytes per tuple can be
saved). But I was discouraged to change this on the reason of binary
compatibility with indexes built before and complexity of the change also,
as quite many things in the code do depend on this maxaligned header (for
inner and dead tuples also).

Another difference is that SpGist nulls mask is inserted after the key
value before the first included one and apply only to included values. It
is not needed for key values, as null key values in SpGist are stored in
separate tree, and it is not needed to mark it null second time. Also nulls
mask size in Spgist does depend on the number of included values in a
tuple, unlike in IndexTuple which contains redundant nulls mask for all
possible INDEX_MAX_KEYS. In certain cases we can store nulls mask in free
bytes after key value before typealign of first included value. (E.g. if
key value is varchar (radix tree) statistically we have only 1/8 of keys
finishing exactly an maxalign, the others will have a natural gap for nulls
mask.)

2. Instead of cluttering tuple->nextOffset with bit flags we could just
> change Tuple Header for leaf tuples with covering indexes. Interpret tuples
> for indexes with included attributes differently, iff it makes code
> cleaner. There are so many changes with SGLT_SET_OFFSET\SGLT_GET_OFFSET
> that it seems viable to put some effort into research of other ways to
> represent two bits for null mask and varatts.
>

Of course SpGist header can be done different for index with and without
included columns. I see two reasons against this:
1. It will be needed to integrate many ifs and in many places keep in mind
whether the index contains included values. It is expected to be much more
code than now and not only in the parts which integrates included values to
leaf tuples. I think this vast changes can puzzle reader much more than
just two small macros evenly copy-pasted in the code.
2. I also see no need to increase SpGist tuple size just for inserting two
bits which are now stored free of charge. I consulted with bit flags
storage in IndexTupleData.t_tid and did it in a similar way. Macros for
GET/SET are basically needed to make bit flags and offset modification
independent and safe in any place of a code.

I added some extra comments and mentions in manual to make all the things
clear (see v7 patch)

> 3. Comment "* SPGiST dead tuple: declaration for examining non-live
> tuples" does not precede relevant code. because struct SpGistDeadTupleData
> was not moved.

You are right, thank you! Corrected this and also removed some unnecessary
declarations.

Thank you for your attention to the patch!

Attachment Content-Type Size
v7-0001-Covering-SP-GiST-index-support-for-INCLUDE-column.patch application/octet-stream 76.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2020-08-24 13:24:04 [ASAN] Postgres14 (Windows 64 bits)
Previous Message Ashutosh Bapat 2020-08-24 13:13:36 Re: jsonb, collection & postgres_fdw