Indexes with duplicate columns

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Indexes with duplicate columns
Date: 2011-12-22 17:17:36
Message-ID: 16303.1324574256@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

In bug #6351 it's pointed out that this fails unexpectedly:

CREATE TABLE tab (id SERIAL, a INTEGER, b INTEGER);
CREATE INDEX tab123 ON tab (a, b, a);
SELECT a, b FROM tab WHERE a = 0 AND b = 1;
ERROR: btree index keys must be ordered by attribute

I looked into this a bit and find that indxpath.c is producing a correct
list of index quals, "a = 0 AND b = 1 AND a = 0", but then createplan.c
messes it up because it matches both copies of "a" to the first possible
match in the index's column list. So what the executor gets looks like
"{INDEX_VAR 1} = 0 AND {INDEX_VAR 2} = 1 AND {INDEX_VAR 1} = 0"
and there's a btree implementation restriction that makes it spit up on
that.

Now, what the planner did here is more wrong than just tripping over a
btree limitation. The actual use-case for an index mentioning the same
table column more than once, IMO, would be if the index columns had
different operator classes and thus supported different sets of
indexable operators. Matching the second instance of "a" to the first
index column could then be asking the index to implement an operator
that that column doesn't support. So we need to fix the planner not
btree.

The representation that indxpath.c actually emits is correct and
unambiguous, because it produces a list of sublists of indexquals,
one sublist per index column. So in that format it's clear which
index column each clause is meant for. But then we flatten the list
in create_index_path, and so createplan.c has to reverse-engineer
the matching, and it really can't get it right unless we're willing
to make it recheck operator matching not only column matching.

The obvious solution seems to be to preserve the list-of-sublists
representation through to createplan.c. Then that code can just verify
the expected column match instead of searching, so it might actually be
a tad faster. However such a change is going to affect cost_index and
all the amcostestimate functions, and likely break any planner plugins
that do anything with IndexPaths. So it's going to be a bit invasive
and I'm inclined to think it's not back-patchable.

My inclination is to fix it that way in HEAD and just leave it as a
known unsupported case in the back branches. This case does not seem
important enough to justify trying to find a solution that would work
without a path representation change.

Comments, other ideas?

regards, tom lane

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-12-22 17:18:54 Re: reprise: pretty print viewdefs
Previous Message Stephen Frost 2011-12-22 17:05:53 Re: Wishlist: parameterizable types