Re: WIP: Covering + unique indexes.

From: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Andrey Borodin <amborodin(at)acm(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>
Subject: Re: WIP: Covering + unique indexes.
Date: 2016-09-21 13:21:52
Message-ID: 8d6b5b13-9c41-bd41-9c86-497e4e6a46ab@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

20.09.2016 08:21, Amit Kapila:
> On Tue, Sep 6, 2016 at 10:18 PM, Anastasia Lubennikova
> <a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
>> 28.08.2016 09:13, Amit Kapila:
>>
>> On Mon, Aug 15, 2016 at 8:15 PM, Anastasia Lubennikova
>> <a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
>>
>>
>> So the patch is correct.
>> We can go further and remove this index_truncate_tuple() call, because
>> the first key of any inner (or root) page doesn't need any key at all.
>>
> Anyway, I think truncation happens if the page is at leaf level and
> that is ensured by check, so I think we can't remove this:
> + if (indnkeyatts != indnatts && P_ISLEAF(pageop))
>
>
> -- I have one more question regarding this truncate high-key concept.
> I think if high key is truncated, then during insertion, for cases
> like below it move to next page, whereas current page needs to be
> splitted.
>
> Assume index on c1,c2,c3 and c2,c3 are including columns.
>
> Actual high key on leaf Page X -
> 3, 2 , 2
> Truncated high key on leaf Page X
> 3
>
> New insertion key
> 3, 1, 2
>
> Now, I think for such cases during insertion if the page X doesn't
> have enough space, it will move to next page whereas ideally, it
> should split current page. Refer function _bt_findinsertloc() for
> this logic.

Thank you again for the review.

The problem seems really tricky, but the answer is simple.
We store included columns unordered. It was mentioned somewhere in
this thread. Let me give you an example:

create table t (i int, p point);
create index on (i) including (p);
"point" data type doesn't have any opclass for btree.
Should we insert (0, '(0,2)') before (0, '(1,1)') or after?
We have no idea what is the "correct order" for this attribute.
So the answer is "it doesn't matter". When searching in index,
we know that only key attrs are ordered, so only them can be used
in scankey. Other columns are filtered after retrieving data.

explain select i,p from t where i =0 and p <@ circle '((0,0),2)';
QUERY PLAN
-------------------------------------------------------------------
Index Only Scan using idx on t (cost=0.14..4.20 rows=1 width=20)
Index Cond: (i = 0)
* Filter: (p <@ '<(0,0),2>'::circle)

*
The same approach is used for included columns of any type, even if
their data types have opclass.

> Is this truncation concept of high key needed for correctness of patch
> or is it just to save space in index? If you need this, then I think
> nbtree/Readme needs to be updated.

Now it's done only for space saving. We never check included attributes
in non-leaf pages, so why store them? Especially if we assume that included
attributes can be quite long.
There is already a note in documentation:

+ It's the same with other constraints (PRIMARY KEY and EXCLUDE).
This can
+ also can be used for non-unique indexes as any columns which
are not required
+ for the searching or ordering of records can be included in the
+ <literal>INCLUDING</> clause, which can slightly reduce the
size of the index,
+ due to storing included attributes only in leaf index pages.

What should I add to README (or to documentation),
to make it more understandable?

> -- I am getting Assertion failure when I use this patch with database
> created with a build before this patch. However, if I create a fresh
> database it works fine. Assertion failure details are as below:
>
> LOG: database system is ready to accept connections
> LOG: autovacuum launcher started
> TRAP: unrecognized TOAST vartag("((bool) 1)", File: "src/backend/access/common/h
> eaptuple.c", Line: 532)
> LOG: server process (PID 1404) was terminated by exception 0x80000003
> HINT: See C include file "ntstatus.h" for a description of the hexadecimal valu
> e.
> LOG: terminating any other active server processes

That is expected behavior, because catalog versions are not compatible.
But I wonder why there was no message about that?
I suppose, that's because CATALOG_VERSION_NO was outdated in my
patch. As well as I know, committer will change it before the commit.
Try new patch with updated value. It should fail with a message about
incompatible versions.

If that is not the reason of your Assertion failure, provide please
more information to reproduce the situation.

> --
> @@ -1260,14 +1262,14 @@ RelationInitIndexAccessInfo(Relation relation)
> * Allocate arrays to hold data
> */
> relation->rd_opfamily = (Oid *)
> - MemoryContextAllocZero(indexcxt, natts * sizeof(Oid));
> + MemoryContextAllocZero(indexcxt, indnkeyatts * sizeof(Oid));
> relation->rd_opcintype = (Oid *)
> - MemoryContextAllocZero(indexcxt, natts * sizeof(Oid));
> + MemoryContextAllocZero(indexcxt, indnkeyatts * sizeof(Oid));
>
> amsupport = relation->rd_amroutine->amsupport;
> if (amsupport > 0)
> {
> - int nsupport = natts * amsupport;
> + int nsupport = indnatts * amsupport;
>
> relation->rd_support = (RegProcedure *)
> MemoryContextAllocZero(indexcxt, nsupport * sizeof(RegProcedure));
> @@ -1281,10 +1283,10 @@ RelationInitIndexAccessInfo(Relation relation)
> }
>
> relation->rd_indcollation = (Oid *)
> - MemoryContextAllocZero(indexcxt, natts * sizeof(Oid));
> + MemoryContextAllocZero(indexcxt, indnatts * sizeof(Oid));
>
> Can you add a comment in above code or some other related place as to
> why you need some attributes in relcache entry of size indnkeyatts and
> others of size indnatts?

Done. I hope that's enough.
The same logic is used in DefineIndex(), that already has comments.

> --
> @@ -63,17 +63,26 @@ _bt_mkscankey(Relation rel, IndexTuple itup)
> {
> ScanKey skey;
> TupleDesc itupdesc;
> - int natts;
> + int indnatts,
> + indnkeyatts;
> int16 *indoption;
> int i;
>
> itupdesc = RelationGetDescr(rel);
> - natts = RelationGetNumberOfAttributes(rel);
> + indnatts = IndexRelationGetNumberOfAttributes(rel);
> + indnkeyatts = IndexRelationGetNumberOfKeyAttributes(rel);
> indoption = rel->rd_indoption;
>
> - skey = (ScanKey) palloc(natts * sizeof(ScanKeyData));
> + Assert(indnkeyatts != 0);
> + Assert(indnkeyatts <= indnatts);
>
> Here I think you need to declare indnatts as PG_USED_FOR_ASSERTS_ONLY,
> otherwise it will give warning on some platforms.
Fixed. Thank you for advice, I didn't know about this macro before.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
including_columns_9.7_v3.patch text/x-patch 136.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-09-21 13:23:34 Re: Tracking wait event for latches
Previous Message Peter Eisentraut 2016-09-21 13:04:08 Re: Logical Replication WIP