Re: Get more from indices.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Get more from indices.
Date: 2014-04-18 08:46:44
Message-ID: 20140418.174644.24190222.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

> I thought some more about this patch, and realized that it's more or less
> morally equivalent to allowing references to ungrouped variables when the
> query has a GROUP BY clause listing all the columns of the primary key.
> In that case the parser is effectively pretending that the GROUP BY list
> contains additional implicit entries that are functionally dependent on
> the entries that are actually there. In this patch, what we want to do
> is recognize that trailing entries in an ORDER BY list are semantically
> no-ops and can be ignored because they are functionally dependent on
> earlier entries.

Ah, that sounds smarter than extending pathekys. I feel it preferable.

> Now, the reason that the parser restricts the functional dependency
> deduction to a primary key is that it wants to be able to identify a
> constraint OID that the query is dependent on to be semantically valid.
> In this case, we don't need such an OID, so just finding any old unique
> index on not-null columns is good enough. (If someone drops the index,
> the optimization might become incorrect, but that would force replanning
> anyway.)

Agreed,

> However, this way of thinking about it shows that the patch is missing
> possible optimizations. If we have "ORDER BY a, b, c" and (a,b) is the
> primary key, then including c in the ORDER BY list is semantically
> redundant, *whether or not we use an indexscan on the pkey index at all*.
> More: if we have "ORDER BY a, b, c" and the primary key is (b,a), we
> can still discard c from the sort requirement, even though the pkey
> index as such isn't helpful for producing the required order.

Hmm yes, it really seems expectable.

> So hacking up the pathkeys attributed to the indexscan is the wrong thing.
> Rather, what we should be looking to do is decide that c is a useless
> pathkey and remove it from the query_pathkeys, much as we'd do if we found
> "c = constant" in WHERE. That would allow optimization of other query
> plans besides scan-the-pkey-index plans.

Ok, I am convinced that your suggestion - truncating
query_pathkeys by removing eventually no-op entries - seems
preferable and will have wider effect naturally - more promised.

I won't persist with the way this patch currently does but the
new patch of course can't come up within this CF. I will agree if
you decide to make this patch 'Returned with Feedback'. (I feel a
little sad for 'Rejected' but you can justly do that if you think
that the patch comming up next is utterly different from this
one:()

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2014-04-18 10:24:09 Typo fix in src/backend/access/transam/recovery.conf.sample
Previous Message Etsuro Fujita 2014-04-18 08:43:42 Minor improvement in gin_private.h