Re: WIP: Covering + unique indexes.

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Covering + unique indexes.
Date: 2016-01-07 21:12:37
Message-ID: CAKJS1f9W0aB-g7H6yYgNBq7hJsOKF3UwHU7-Q5jobbaTyK9f4g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 7 January 2016 at 06:36, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Tue, Jan 5, 2016 at 11:55 PM, David Rowley
> <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> > create table ab (a int,b int);
> > insert into ab select x,y from generate_series(1,20) x(x),
> > generate_series(10,1,-1) y(y);
> > create index on ab (a) including (b);
> > explain select * from ab order by a,b;
> > QUERY PLAN
> > ----------------------------------------------------------
> > Sort (cost=10.64..11.14 rows=200 width=8)
> > Sort Key: a, b
> > -> Seq Scan on ab (cost=0.00..3.00 rows=200 width=8)
> > (3 rows)
>
> If you set enable_sort=off, then you get the index-only scan with no
> sort. So it believes the index can be used for ordering (correctly, I
> think), just sometimes it thinks it is not faster to do it that way.
>
> I'm not sure why this would be a correctness problem. The covered
> column does not participate in uniqueness checks, but it still usually
> participates in index ordering. (That is why dummy op-classes are
> needed if you want to include non-sortable-type columns as being
> covered.)
>

If that's the case, then it appears that I've misunderstood INCLUDING. From
reading _bt_doinsert() it appeared that it'll ignore the INCLUDING columns
and just find the insert position based on the key columns. Yet that's not
the way that it appears to work. I was also a bit confused, as from working
with another database which has very similar syntax to this, that one only
includes the columns to allow index only scans, and the included columns
are not indexed, therefore can't be part of index quals and the index only
provides a sorted path for the indexed columns, and not the included
columns.

Saying that, I'm now a bit confused to why the following does not produce 2
indexes which are the same size:

create table t1 (a int, b text);
insert into t1 select x,md5(random()::text) from generate_series(1,1000000)
x(x);
create index t1_a_inc_b_idx on t1 (a) including (b);
create index t1_a_b_idx on t1 (a,b);
select pg_relation_Size('t1_a_b_idx'),pg_relation_size('t1_a_inc_b_idx');
pg_relation_size | pg_relation_size
------------------+------------------
59064320 | 58744832
(1 row)

Also, if we want INCLUDING() to mean "uniqueness is not enforced on these
columns, but they're still in the index", then I don't really think
allowing types without a btree opclass is a good idea. It's likely too
surprised filled and might not be what the user actually wants. I'd suggest
that these non-indexed columns would be better defined by further expanding
the syntax, the first (perhaps not very good) thing that comes to mind is:

create unique index idx_name on table (unique_col) also index
(other,idx,cols) including (leaf,onlycols);

Looking up thread, I don't think I was the first to be confused by this.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2016-01-07 21:28:02 Re: checkpointer continuous flushing
Previous Message Jim Nasby 2016-01-07 20:32:36 Re: No Issue Tracker - Say it Ain't So!