Re: Optimizer questions

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject: Re: Optimizer questions
Date: 2016-03-09 12:29:57
Message-ID: 56E01745.60503@postgrespro.ru
Lists: pgsql-hackers

On 09.03.2016 09:15, Tom Lane wrote:
> I wrote:
>> BTW, there's some additional refactoring I had had in mind to do in
>> grouping_planner to make its handling of the targetlist a bit more
>> organized; in particular, I'd like to see it using PathTarget
>> representation more consistently throughout the post-scan-join steps.
> See 51c0f63e4d76a86b44e87876a6addcfffb01ec28 --- I think this gets
> things to where we could plug in additional levels of targets without
> too much complication.
>
> regards, tom lane

So, if I understand you correctly, there are two major concerns:

1. Volatile functions. I wonder whether it is really required to re-evaluate
a volatile function for each record even if a LIMIT clause is present.
The documentation says:
"A query using a volatile function will re-evaluate the function at
every row where its value is needed."
So if we are using a sort with a limit and the value of the function is not
used in the sort, then it is correct to say that the value of this function
is not needed for the rows discarded by the limit, so there is no need to
evaluate it for them, is there?

2. Narrowing functions, like md5.
Here I do not have a good idea of how to support them. It looks like the cost
of SORT should depend on tuple width. Only in that case can the optimizer
determine whether it is more efficient to evaluate the function earlier or
to postpone its execution.
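
To make the first concern concrete, here is a minimal sketch in Python (not
PostgreSQL code) that counts how often a function is called under the two
orderings. The function names and the use of heapq for "sort with limit" are
assumptions made only for this illustration.

```python
import heapq


def project_then_sort_limit(rows, key, func, limit):
    """Evaluate func for every row, then sort and keep the first `limit` rows."""
    calls = 0
    projected = []
    for row in rows:
        calls += 1
        projected.append((row, func(row)))
    # Sort on the key only; the function result just travels along.
    projected.sort(key=lambda pair: key(pair[0]))
    return projected[:limit], calls


def sort_limit_then_project(rows, key, func, limit):
    """Sort and apply the limit first, then evaluate func on the survivors."""
    top = heapq.nsmallest(limit, rows, key=key)
    calls = 0
    result = []
    for row in top:
        calls += 1
        result.append((row, func(row)))
    return result, calls
```

For a deterministic function both variants return the same rows, but the
first calls the function once per input row and the second only once per
returned row. For a volatile function the results can differ, which is
exactly the question raised in point 1.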

I think that the best approach is to generate two different paths: the
original one, where projection is always done before the sort, and another
one with postponed projection of non-trivial columns. Then we compare the
costs of the two paths and choose the cheaper one.
Unfortunately, I do not currently understand how to implement this with the
existing grouping_planner.
Do you think that it is possible?
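
As a rough illustration of comparing the two paths, here is a toy cost model
in Python. It is only a sketch: the formula, the constants cmp_cost and
byte_cost, and all function names are hypothetical and are not taken from the
PostgreSQL planner. The only idea it encodes is that sort cost grows with
tuple width and that a postponed function runs on at most LIMIT rows.

```python
import math


def sort_cost(n_rows, width, cmp_cost=0.0025, byte_cost=0.00001):
    """Toy sort cost: N*log2(N) steps, each costing more for wider tuples."""
    if n_rows < 2:
        return 0.0
    return n_rows * math.log2(n_rows) * (cmp_cost + width * byte_cost)


def path_costs(n_rows, limit, in_width, out_width, func_cost):
    """Return (early, late) costs for projecting before or after the sort.

    in_width:  tuple width when the function inputs are carried through the sort
    out_width: tuple width when the function result is carried instead
    """
    early = n_rows * func_cost + sort_cost(n_rows, out_width)
    late = sort_cost(n_rows, in_width) + min(limit, n_rows) * func_cost
    return early, late


def choose_path(n_rows, limit, in_width, out_width, func_cost):
    """Pick the cheaper of the two candidate paths."""
    early, late = path_costs(n_rows, limit, in_width, out_width, func_cost)
    return "project_before_sort" if early < late else "project_after_sort"
```

Under this model an expensive function with unchanged width, for example
choose_path(100000, 10, 100, 100, 1.0), favours the postponed projection,
while a cheap narrowing function, for example
choose_path(100000, 10, 8000, 32, 0.01), favours projecting before the sort.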

An alternative approach is to do something like in my proposed patch, but
take into account the cost of function execution and check for the presence
of volatile/narrowing functions. This approach provides better flexibility,
because we can choose the subset of columns not used in the sort whose
evaluation should be postponed. But here we once again make a local
decision during construction of the path instead of comparing the costs of
full paths.
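
The local, per-column decision could look roughly like the following Python
sketch. The TargetColumn fields and the rule itself are assumptions for
illustration only, not the logic of the proposed patch. In particular,
keeping volatile functions before the sort is just the conservative choice
here, since whether they may be postponed is the open question above.

```python
from dataclasses import dataclass


@dataclass
class TargetColumn:
    name: str
    used_in_sort: bool   # referenced by the sort keys
    is_volatile: bool    # contains a volatile function
    eval_cost: float     # hypothetical per-row evaluation cost
    input_width: int     # bytes carried through the sort if postponed
    output_width: int    # bytes carried through the sort if evaluated early


def split_targets(columns, trivial_cost=0.0):
    """Split columns into (evaluate before sort, postpone until after sort)."""
    early, late = [], []
    for col in columns:
        postpone = (
            not col.used_in_sort
            and not col.is_volatile                   # conservative assumption
            and col.eval_cost > trivial_cost          # skip trivial expressions
            and col.output_width >= col.input_width   # not a narrowing function
        )
        (late if postpone else early).append(col.name)
    return early, late
```

Each column is judged in isolation here, which is the weakness mentioned
above: nothing compares the total cost of the resulting plan against the
plan that projects everything before the sort.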

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
