Re: Get more from indices.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Get more from indices.
Date: 2014-04-15 22:10:17
Message-ID: 5212.1397599817@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> writes:
> [ pathkey_and_uniqueindx_v10_20130411.patch ]

I thought some more about this patch, and realized that it's more or less
morally equivalent to allowing references to ungrouped variables when the
query has a GROUP BY clause listing all the columns of the primary key.
In that case the parser is effectively pretending that the GROUP BY list
contains additional implicit entries that are functionally dependent on
the entries that are actually there. In this patch, what we want to do
is recognize that trailing entries in an ORDER BY list are semantically
no-ops and can be ignored because they are functionally dependent on
earlier entries.

Now, the reason that the parser restricts the functional dependency
deduction to a primary key is that it wants to be able to identify a
constraint OID that the query is dependent on to be semantically valid.
In this case, we don't need such an OID, so just finding any old unique
index on not-null columns is good enough. (If someone drops the index,
the optimization might become incorrect, but that would force replanning
anyway.)

However, this way of thinking about it shows that the patch is missing
possible optimizations. If we have "ORDER BY a, b, c" and (a,b) is the
primary key, then including c in the ORDER BY list is semantically
redundant, *whether or not we use an indexscan on the pkey index at all*.
More: if we have "ORDER BY a, b, c" and the primary key is (b,a), we
can still discard c from the sort requirement, even though the pkey
index as such isn't helpful for producing the required order.

So hacking up the pathkeys attributed to the indexscan is the wrong thing.
Rather, what we should be looking to do is decide that c is a useless
pathkey and remove it from the query_pathkeys, much as we'd do if we found
"c = constant" in WHERE. That would allow optimization of other query
plans besides scan-the-pkey-index plans.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas 'ads' Scherbaum 2014-04-15 22:19:08 Patch: iff -> if
Previous Message Hannu Krosing 2014-04-15 21:55:08 Question about optimising (Postgres_)FDW