Re: WIP: Covering + unique indexes.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Andrey Borodin <amborodin(at)acm(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>
Subject: Re: WIP: Covering + unique indexes.
Date: 2016-10-04 14:20:04
Message-ID: CA+TgmoaV+nTh=EFKRbHS1-4onOQ-bXOZtON7cuqn2WpNLeUeTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 4, 2016 at 9:20 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> I'd say that the reason for not using included columns in any
>>> operations which require comparisons, is that they don't have opclass.
>>> Let's go back to the example of points.
>>> This data type don't have any opclass for B-tree, because of fundamental
>>> reasons.
>>> And we can not apply _bt_compare() and others to this attribute, so
>>> we don't include it to scan key.
>>>
>>> create table t (i int, i2 int, p point);
>>> create index idx1 on (i) including (i2);
>>> create index idx2 on (i) including (p);
>>> create index idx3 on (i) including (i2, p);
>>> create index idx4 on (i) including (p, i2);
>>>
>>> You can keep tuples ordered in idx1, but not for idx2, partially ordered for
>>> idx3, but not for idx4.
>>
>> Yeah, I think we shouldn't go there. I mean, once you start ordering
>> by INCLUDING columns, then you're going to need to include them in
>> leaf pages because otherwise you can't actually guarantee that they
>> are in the right order.
>
> I am not sure what you mean by above, because patch already stores
> INCLUDING columns in leaf pages.

Sorry, I meant non-leaf pages.

>> And then you have to wonder why an INCLUDING
>> column is any different from a non-INCLUDING column. It seems best to
>> make a firm rule that INCLUDING columns are there only for the values,
>> not for ordering. That rule is simple and clear, which is a good
>> thing.
>
> Okay, we can make that firm rule, but I think reasoning behind that
> should be clear. As far as I get it by reading some of the mails in
> this thread, it is because some of the other databases doesn't seem to
> support ordering for included columns or supporting the same can
> complicate the code. One point, we should keep in mind that
> suggestion for including many other columns in INCLUDING clause to use
> Index Only scans by other databases might not hold equally good for
> PostgreSQL because it can lead to many HOT updates as non-HOT updates.

Right. Looking back, the originally articulated rationale for this
patch was that you might want a single index that is UNIQUE ON (a) but
also INCLUDING (b) rather than two indexes, a unique index on (a) and
a non-unique index on (a, b). In that case, the patch is a
straight-up win: you get the same number of HOT updates either way,
but you don't use as much disk space, or spend as much CPU time and
WAL updating your indexes.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-10-04 14:43:48 Misidentification of Python shared library
Previous Message Julien Rouhaud 2016-10-04 14:16:46 Re: Rename max_parallel_degree?