Re: WIP: Covering + unique indexes.

From: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
To: Peter Geoghegan <pg(at)heroku(dot)com>, David Steele <david(at)pgmasters(dot)net>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Covering + unique indexes.
Date: 2016-03-21 16:53:49
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

19.03.2016 08:00, Peter Geoghegan:
> On Fri, Mar 18, 2016 at 5:15 AM, David Steele <david(at)pgmasters(dot)net> wrote:
>> It looks like this patch should be marked "needs review" and I have done so.
> Uh, no it shouldn't. I've posted an extensive review on the original
> design thread. See CF entry:
> Marked "Waiting on Author".
Thanks to David,
I've missed these letters at first.
I'll answer here.

> * You truncate (remove suffix attributes -- the "included" attributes)
> within _bt_insertonpg():
> - right_item = CopyIndexTuple(item);
> + indnatts = IndexRelationGetNumberOfAttributes(rel);
> + indnkeyatts = IndexRelationGetNumberOfKeyAttributes(rel);
> +
> + if (indnatts != indnkeyatts)
> + {
> + right_item = index_reform_tuple(rel, item, indnatts, indnkeyatts);
> + right_item_sz = IndexTupleDSize(*right_item);
> + right_item_sz = MAXALIGN(right_item_sz);
> + }
> + else
> + right_item = CopyIndexTuple(item);
> ItemPointerSet(&(right_item->t_tid), rbkno, P_HIKEY);
> I suggest that you do this within _bt_insert_parent(), instead, iff
> the original target page is know to be a leaf page. That's where it
> needs to happen for conventional suffix truncation, which has special
> considerations when determining which attributes are safe to truncate
> (or even which byte in the first distinguishing attribute it is okay
> to truncate past)

I agree that _bt_insertonpg() is not right for truncation.
Furthermore, I've noticed that all internal keys are solely the copies
of "High keys" from the leaf pages. Which is pretty logical.
Therefore, if we have already truncated the tuple, when it became a High
key, we do not need the same truncation within _bt_insert_parent() or
any other function.
So the only thing to worry about is the HighKey truncation. I rewrote
the code. Now only _bt_split cares about truncation.

It's a bit more complicated to add it into index creation algorithm.
There's a trick with a "high key".
* We copy the last item on the page into the new page, and then
* rearrange the old page so that the 'last item' becomes its
high key
* rather than a true data item. There had better be at least two
* items on the page already, else the page would be empty of
* data.
* Move 'last' into the high key position on opage

To be consistent with other steps of algorithm ( all high keys must be
truncated tuples), I had to update this high key on place:
delete the old one, and insert truncated high key.
The very same logic I use to truncate posting list of a compressed tuple
in the "btree_compression" patch. [1]
I hope, both patches will be accepted, and then I'll thoroughly merge them .

> * I think the comparison logic may have a bug.
> Does this work with amcheck? Maybe it works with bt_index_check(), but
> not bt_index_parent_check()? I think that you need to make sure that
> _bt_compare() knows about this, too. That's because it isn't good
> enough to let a truncated internal IndexTuple compare equal to a
> scankey when non-truncated attributes are equal.

It is a very important issue. But I don't think it's a bug there.
I've read amcheck sources thoroughly and found that the problem appears at

static bool
invariant_key_less_than_equal_nontarget_offset(BtreeCheckState *state,
Page nontarget, ScanKey key,
OffsetNumber upperbound)
int16 natts = state->rel->rd_rel->relnatts;
int32 cmp;

cmp = _bt_compare(state->rel, natts, key, nontarget, upperbound);

return cmp <= 0;

It uses scankey, made with _bt_mkscankey() which uses only key
attributes, but calls _bt_compare with wrong keysz.
If we wiil use nkeyatts = state->rel->rd_index->relnatts; instead of
natts, all the checks would be passed successfully.

Same for invariant_key_greater_than_equal_offset() and

In my view, it's the correct way to fix this problem, because the caller
is responsible for passing proper arguments to the function.
Of course I will add a check into bt_compare, but I'd rather make it an
assertion (see the patch attached).

I'll add a flag to distinguish regular and truncated tuples, but it will
not be used in this patch. Please, comment, if I've missed something.
As you've already mentioned, neither high keys, nor tuples on internal
pages are using "itup->t_tid.ip_posid", so I'll take one bit of it.

It will definitely require changes in the future works on suffix
truncation or something like that, but IMHO for now it's enough.

Do you have any objections or comments?


Anastasia Lubennikova
Postgres Professional:
The Russian Postgres Company

Attachment Content-Type Size
including_columns_7.0.patch text/x-patch 101.1 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Teodor Sigaev 2016-03-21 16:57:10 Re: [PATCH] we have added support for box type in SP-GiST index
Previous Message Jesper Pedersen 2016-03-21 16:53:33 Re: Speedup twophase transactions