Re: why doesn't optimizer can pull up where a > ( ... )

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: why doesn't optimizer can pull up where a > ( ... )
Date: 2019-11-20 17:25:37
Message-ID: 20191120172537.w6qx3pjvcsd4rhdi@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 20, 2019 at 11:12:56AM -0500, Tom Lane wrote:
>Daniel Gustafsson <daniel(at)yesql(dot)se> writes:
>>> On 20 Nov 2019, at 13:15, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>>> 2. why pg can't do it, while greenplum can?
>
>> It's worth noting that Greenplum, the example you're referring to, is using a
>> completely different query planner, and different planners have different
>> characteristics and capabilities.
>
>Yeah. TBH, I think the described transformation is well out of scope
>for what PG's planner tries to do. Greenplum is oriented to use-cases
>where it might be worth spending lots of planner cycles looking for
>optimizations like this one, but in a wider environment it's much
>harder to make the argument that this would be a profitable use of
>planner effort.

True.

>I'm content to say that the application should have written the query
>with a GROUP BY to begin with.
>

I'm not sure I agree with that. The problem is this really depends on
the number of rows that will need the subquery result (i.e. based on
selectivity of conditions in the outer query). For small number of rows
it's fine to execute the subplan repeatedly, for large number of rows
it's better to rewrite it to the GROUP BY form. It's hard to make those
judgements in the application, I think.

>Having said that, the best form of criticism is a patch. If somebody
>actually wrote the code to do something like this, we could look at how
>much time it wasted in which unsuccessful cases and then have an
>informed discussion about whether it was worth adopting.
>

Right.

>(BTW, I do not think the transformation as described is even formally
>correct, at least not without some unstated assumptions. How is it
>okay to push down the "p_size > 40" condition into the subquery? The
>aggregation in the original query will include rows where that isn't
>true.)

Yeah. I think the examples are a bit messed up, and surely there are
other restrictions on applicability of this optimization.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-11-20 17:27:56 Re: Role membership and DROP
Previous Message Tomas Vondra 2019-11-20 17:20:01 Re: why doesn't optimizer can pull up where a > ( ... )