Re: WIP: Covering + unique indexes.

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Covering + unique indexes.
Date: 2016-01-31 08:04:42
Message-ID: CAKJS1f__-i3eOc8pD6WFODDFvPrMBhe4o3KE0kokFWy+AQrStQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 27 January 2016 at 03:35, Anastasia Lubennikova
<a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
> including_columns_3.0 is the latest version of patch.
> And changes regarding the previous version are attached in a separate patch.
> Just to ease the review and debug.

Hi,

I've made another pass over the patch. There's still a couple of
things that I think need to be looked at.

Do we need the "b (included)" here? The key is (a) = (1). Having
irrelevant details might be confusing.

postgres=# create table a (a int not null, b int not null);
CREATE TABLE
postgres=# create unique index on a (a) including(b);
CREATE INDEX
postgres=# insert into a values(1,1);
INSERT 0 1
postgres=# insert into a values(1,1);
ERROR: duplicate key value violates unique constraint "a_a_b_idx"
DETAIL: Key (a, b (included))=(1, 1) already exists.

Extra tabs:
/* Truncate nonkey attributes when inserting on nonleaf pages. */
if (rel->rd_index->indnatts != rel->rd_index->indnkeyatts
&& !P_ISLEAF(lpageop))
{
itup = index_reform_tuple(rel, itup,
rel->rd_index->indnatts, rel->rd_index->indnkeyatts);
}

In index_reform_tuple() I find it a bit scary that you change the
TupleDesc's number of attributes then set it back again once you're
finished reforming the shortened tuple.
Maybe it would be better to modify index_form_tuple() to accept a new
argument with a number of attributes, then you can just Assert that
this number is never higher than the number of attributes in the
TupleDesc.

I'm also not that keen on index_reform_tuple() in general. I wonder if
there's a way we can just keep the Datum/isnull arrays a bit longer,
and only form the tuple when needed. I've not looked into this in
detail, but it does look like reforming the tuple is not going to be
cheap.

If we do need to keep this function, I think a better name might be
index_trim_tuple() and I don't think you need to pass the original
length. It might make sense to Assert() that the trim length is
smaller than the tuple size

What statement will cause this:

numberOfKeyAttributes = list_length(stmt->indexParams);
if (numberOfKeyAttributes <= 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("must specify at least one key column")));

I seem to just get errors from the parser when trying.

Much of this goes over 80 chars:

/*
* We append any INCLUDING columns onto the indexParams list so that
* we have one list with all columns. Later we can determine which of these
* are key columns, and which are just part of the INCLUDING list by
check the list
* position. A list item in a position less than ii_NumIndexKeyAttrs is part of
* the key columns, and anything equal to and over is part of the
* INCLUDING columns.
*/
stmt->indexParams = list_concat(stmt->indexParams, stmt->indexIncludingParams);

in gistrescan() there is some code:

for (attno = 1; attno <= natts; attno++)
{
TupleDescInitEntry(so->giststate->fetchTupdesc, attno, NULL,
scan->indexRelation->rd_opcintype[attno - 1],
-1, 0);
}

Going by RelationInitIndexAccessInfo() rd_opcintype[] is allocated to
be sized by the number of key columns, but this loop goes over the
number of attribute columns.
Perhaps this is not a big problem since GIST does not support
INCLUDING columns, but it does seem wrong still.

Which brings me to the fact that I've spent a bit of time trying to
look for places where you've forgotten to change natts to nkeyatts. I
did find this one, but I don't have much confidence that there's not
lots more places that have been forgotten. Apart from this one, how
confident are you that you've found all the places? I'm getting
towards being happy with the code that I see that's been changed, but
I'm hesitant to mark as "Ready for committer" due to not being all
that comfortable that all the code that needs to be updated has been
updated. I'm not quite sure of a good way to find all these places. I
wondering if hacking the code so that each btree index which is
created with > 1 column puts all but the first column into the
INCLUDING columns, then run the regression tests to see if there are
any crashes. I'm really not that sure of how else to increase the
confidence levels on this. Do you have ideas?

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-01-31 08:07:17 Re: custom function for converting human readable sizes to bytes
Previous Message Konstantin Knizhnik 2016-01-31 07:54:35 Re: Optimizer questions