Re: WIP: Covering + unique indexes.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>
Subject: Re: WIP: Covering + unique indexes.
Date: 2018-03-24 02:21:06
Message-ID: CAH2-WzkHH1E7MR08zMKpoHEp8g=0xoTNwCOLt8B_ETP4SeGi=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 22, 2018 at 8:23 AM, Alexander Korotkov
<a(dot)korotkov(at)postgrespro(dot)ru> wrote:
>> * There is minor formatting issue in this part of code. Some spaces need
>> to be replaced with tabs.
>> +IndexTuple
>> +index_truncate_tuple(Relation idxrel, IndexTuple olditup)
>> +{
>> + TupleDesc itupdesc =
>> CreateTupleDescCopyConstr(RelationGetDescr(idxrel));
>> + Datum values[INDEX_MAX_KEYS];
>> + bool isnull[INDEX_MAX_KEYS];
>> + IndexTuple newitup;

The last time I looked at this patch, in April 2017, I made the point
that we should add something like an "nattributes" argument to
index_truncate_tuple() [1], rather than always using
IndexRelationGetNumberOfKeyAttributes() within index_truncate_tuple().
I can see that that change hasn't been made since that time.

With that approach, we can avoid relying on catalog metadata to the
same degree, which was a specific concern that Tom had around that
time. It's easy to do something with t_tid's offset, which is unused
in internal page IndexTuples. We do very similar things in GIN, where
alternative use of an IndexTuple's t_tid supports all kinds of
enhancements, some of which were not originally anticipated. Alexander
surely knows more about this than I do, since he wrote that code.

Having this index_truncate_tuple() "nattributes" argument, and storing
the number of attributes directly improves quite a lot of things:

* It makes diagnosing issues in the field quite a bit easier. Tools
like pg_filedump can do the right thing (Tom mentioned pg_filedump and
amcheck specifically). The nbtree IndexTuple format should not need to
be interpreted in a context-sensitive way, if we can avoid it.

* It lets you use index_truncate_tuple() for regular suffix truncation
in the future. These INCLUDE IndexTuples are really just a special
case of suffix truncation. At least, they should be, because otherwise
an eventual suffix truncation feature is going to be incompatible with
the INCLUDE tuple format. *Not* doing this makes suffix truncation
harder. Suffix truncation is a classic technique, first described by
Bayer in 1977, and we are very probably going to add it someday.

* Once you can tell a truncated IndexTuple from a non-truncated one
with little or no context, you can add defensive assertions in various
places where they're helpful. Similarly, amcheck can use and expect
this as a cross-check against IndexRelationGetNumberOfKeyAttributes().
This will increase confidence in the design, both initially and over
time.

I must say that I am disappointed that nothing has happened here,
especially because this really wasn't very much additional work, and
has essentially no downside. I can see that it doesn't work that way
in the Postgres Pro fork [2], and diverging from that may
inconvenience Postgres Pro, but that's a downside of forking. I don't
think that the community should have to absorb that cost.

> +Notes to Operator Class Implementors
> +------------------------------------
>
> Besides I really appreciate it, it seems to be unrelated to the covering
> indexes.
> I'd like this to be extracted into separate patch and be committed
> separately.

Commit 3785f7ee, from last month, moved the original "Notes to
Operator Class Implementors" section to the SGML docs. It looks like
that README section was accidentally reintroduced during rebasing. The
new information ("Included attributes in B-tree indexes") should be
moved over to the new section of the user docs -- the section that
3785f7ee added.

[1] https://postgr.es/m/CAH2-Wzm9y59h2m6iZjM4fpdUP5r4bsRVzGbN2gTRCO1j4nZmtw@mail.gmail.com
[2] https://github.com/postgrespro/postgrespro/blob/PGPRO9_5/src/backend/access/common/indextuple.c#L451
--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2018-03-24 02:47:39 Re: public schema default ACL
Previous Message Alvaro Herrera 2018-03-24 00:23:40 Re: ON CONFLICT DO UPDATE for partitioned tables