Re: WIP: Covering + unique indexes.

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Andrey Borodin <amborodin(at)acm(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>
Subject: Re: WIP: Covering + unique indexes.
Date: 2016-10-04 13:20:35
Message-ID: CAA4eK1KnxpoiNNm-cpXEkvOwYxmioeZ5X7_yJU3cVcv6dQFMgA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 27, 2016 at 7:51 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Sep 26, 2016 at 11:17 AM, Anastasia Lubennikova
> <a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
>>> Is there any fundamental problem in storing them in ordered way? I
>>> mean to say, you need to anyway store all the column values on leaf
>>> page, so why can't we find the exact location for the complete key.
>>> Basically use truncated key to reach to leaf level and then use the
>>> complete key to find the exact location to store the key. I might be
>>> missing some thing here, but if we can store them in ordered fashion,
>>> we can use them even for queries containing ORDER BY (where ORDER BY
>>> contains included columns).
>>
>> I'd say that the reason for not using included columns in any
>> operations which require comparisons, is that they don't have opclass.
>> Let's go back to the example of points.
>> This data type don't have any opclass for B-tree, because of fundamental
>> reasons.
>> And we can not apply _bt_compare() and others to this attribute, so
>> we don't include it to scan key.
>>
>> create table t (i int, i2 int, p point);
>> create index idx1 on (i) including (i2);
>> create index idx2 on (i) including (p);
>> create index idx3 on (i) including (i2, p);
>> create index idx4 on (i) including (p, i2);
>>
>> You can keep tuples ordered in idx1, but not for idx2, partially ordered for
>> idx3, but not for idx4.
>
> Yeah, I think we shouldn't go there. I mean, once you start ordering
> by INCLUDING columns, then you're going to need to include them in
> leaf pages because otherwise you can't actually guarantee that they
> are in the right order.
>

I am not sure what you mean by above, because patch already stores
INCLUDING columns in leaf pages.

> And then you have to wonder why an INCLUDING
> column is any different from a non-INCLUDING column. It seems best to
> make a firm rule that INCLUDING columns are there only for the values,
> not for ordering. That rule is simple and clear, which is a good
> thing.
>

Okay, we can make that firm rule, but I think reasoning behind that
should be clear. As far as I get it by reading some of the mails in
this thread, it is because some of the other databases doesn't seem to
support ordering for included columns or supporting the same can
complicate the code. One point, we should keep in mind that
suggestion for including many other columns in INCLUDING clause to use
Index Only scans by other databases might not hold equally good for
PostgreSQL because it can lead to many HOT updates as non-HOT updates.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2016-10-04 14:16:46 Re: Rename max_parallel_degree?
Previous Message Geoff Winkless 2016-10-04 13:17:17 Re: pnstrdup considered armed and dangerous