Re: WIP: Covering + unique indexes.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>
Subject: Re: WIP: Covering + unique indexes.
Date: 2018-03-26 19:57:19
Message-ID: CAH2-Wzmm+nYmpmT_XmSwcNeZPvG0eGbKYMHaSwWkEOs1PAmaEA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 26, 2018 at 3:10 AM, Alexander Korotkov
<a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> So, as I get you're proposing to introduce INDEX_ALT_TID_MASK flag
> which would indicate that we're storing something special in the t_tid
> offset. And that should help us not only for covering indexes, but also for
> further btree enhancements including suffix truncation. What exactly do
> you propose to store into t_tid offset when INDEX_ALT_TID_MASK flag
> is set? Is it number of attributes in this particular index tuple?

Yes. I think that once INDEX_ALT_TID_MASK is available, we should
store the number of attributes in that particular "separator key"
tuple (which has undergone suffix truncation), and always work off of
that. You could then have status bits in offset as follows:

* 1 bit that represents that this is a "separator key" IndexTuple
(high key or internal IndexTuple). Otherwise, it's a leaf IndexTuple
with an ordinary heap TID. (When INDEX_ALT_TID_MASK isn't set, it's
the same as today.)

* 3 reserved bits. I think that one of these bits can eventually be
used to indicate that the internal IndexTuple actually has a
"normalized key" representation [1], which seems like the best way to
do suffix truncation, long term. I think that we should support simple
suffix truncation, of the kind that this patch implements, alongside
normalized key suffix truncation. We need both for various reasons
[2].

Not sure what the other two flag bits might be used for, but they seem
worth having.

* 12 bits for the number of attributes, which should be more than
enough, even when INDEX_MAX_KEYS is significantly higher than 32. A
static assertion can keep this safe when INDEX_MAX_KEYS is set
ridiculously high.
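
To make the proposed layout concrete, here is a minimal sketch of how the 16 bits of the offset might be split. The 1 + 3 + 12 widths come from the list above; the macro names, the function names, and the exact bit positions are assumptions made up for illustration, not anything from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout of the 16-bit offset when INDEX_ALT_TID_MASK is set.
 * Only the widths (1 + 3 + 12) come from the proposal; the names and the
 * bit positions are assumptions for this sketch.
 */
#define SKETCH_SEPARATOR_KEY_BIT 0x8000u /* 1 bit: "separator key" tuple */
#define SKETCH_RESERVED_MASK     0x7000u /* 3 reserved status bits */
#define SKETCH_NATTS_MASK        0x0FFFu /* 12 bits: number of attributes */

/* Assumed stand-in for INDEX_MAX_KEYS (the email implies a default of 32). */
#define SKETCH_INDEX_MAX_KEYS 32

/* The static assertion mentioned above: the count must fit in 12 bits. */
_Static_assert(SKETCH_INDEX_MAX_KEYS <= SKETCH_NATTS_MASK,
               "attribute count does not fit in 12 bits");

/* Pack the separator-key flag and attribute count; reserved bits stay 0. */
static uint16_t
sketch_pack_offset(int is_separator, unsigned natts)
{
    assert(natts <= SKETCH_NATTS_MASK);
    return (uint16_t) ((is_separator ? SKETCH_SEPARATOR_KEY_BIT : 0u) |
                       (natts & SKETCH_NATTS_MASK));
}

/* Extract the number of attributes stored in the tuple. */
static unsigned
sketch_natts(uint16_t offset)
{
    return offset & SKETCH_NATTS_MASK;
}

/* Report whether the tuple is a high key or internal ("separator") tuple. */
static int
sketch_is_separator(uint16_t offset)
{
    return (offset & SKETCH_SEPARATOR_KEY_BIT) != 0;
}
```

With this split, a separator key truncated to two attributes would be stored as sketch_pack_offset(1, 2), and readers would always work off of sketch_natts() rather than the index-wide attribute count.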

I think that this scheme is future-proof. Maybe you have additional
ideas on the representation. Please let me know what you think.

When we eventually add optimizations that affect IndexTuples on the
leaf level, we can start using the block number (bi_hi + bi_lo)
itself, much like GIN posting lists. There is no need to consider the
leaf level optimizations any further today, because the block number
gives us many more bits to work with.

In internal page items, the block number is always a block number, so
internal IndexTuples are rather like GIN posting tree pointers in the
main entry tree (its leaf level) -- a conventional item pointer block
number is used, alongside unconventional use of the offset field,
where there are 16 bits available because no real offset is required.
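
As a rough illustration of the internal-page case, the sketch below keeps a real child block number in bi_hi/bi_lo and treats the 16-bit offset purely as status bits. The struct and function names are hypothetical; the field split only mirrors the bi_hi + bi_lo naming used above.

```c
#include <stdint.h>

/* Hypothetical stand-in for an item pointer: 32-bit block number + offset. */
typedef struct SketchItemPointer
{
    uint16_t bi_hi;   /* high 16 bits of the block number */
    uint16_t bi_lo;   /* low 16 bits of the block number */
    uint16_t offset;  /* no real offset needed here: 16 free bits */
} SketchItemPointer;

/* Build an internal-page item: conventional block, unconventional offset. */
static SketchItemPointer
sketch_make_downlink(uint32_t child_block, uint16_t status_bits)
{
    SketchItemPointer tid;

    tid.bi_hi = (uint16_t) (child_block >> 16);
    tid.bi_lo = (uint16_t) (child_block & 0xFFFFu);
    tid.offset = status_bits;
    return tid;
}

/* The block number is always a block number in internal page items. */
static uint32_t
sketch_block_number(const SketchItemPointer *tid)
{
    return ((uint32_t) tid->bi_hi << 16) | tid->bi_lo;
}
```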

[1] https://wiki.postgresql.org/wiki/Key_normalization#Optimizations_enabled_by_key_normalization
[2] https://wiki.postgresql.org/wiki/Key_normalization#How_big_can_normalized_keys_get.2C_and_is_it_worth_it.3F
--
Peter Geoghegan
