Re: EXISTS clauses not being optimized in the face of 'one time pass' optimizable expressions

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: EXISTS clauses not being optimized in the face of 'one time pass' optimizable expressions
Date: 2016-07-01 16:02:38
Message-ID: 20160701160238.GA21416@tamriel.snowman.net
Lists: pgsql-hackers

Tom, all,

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Tue, Jun 21, 2016 at 4:18 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> >> explain analyze select * from foo where false or exists (select 1 from
> >> bar where good and foo.id = bar.id); -- A
> >> explain analyze select * from foo where exists (select 1 from bar
> >> where good and foo.id = bar.id); -- B
> >>
> >> These queries are trivially verified as identical but give very different plans.
>
> > Right. I suspect it wouldn't be very hard to notice the special case of
> > FALSE OR (SOMETHING THAT MIGHT NOT BE FALSE) but I'm not sure that's
> > worth optimizing by itself.
>
> Constant-folding will get rid of the OR FALSE (as well as actually-useful
> variants of this example). The problem is that that doesn't happen till
> after we identify semijoins. So the second one gives you a semijoin plan
> and the first doesn't. This isn't especially easy to improve. Much of
> the value of doing constant-folding would disappear if we ran it before
> subquery pullup + join simplification, because in non-stupidly-written
> queries those are what expose the expression simplification opportunities.
> We could run it twice but that seems certain to be a dead loser most of
> the time.

While it might be a loser most of the time to run it twice, I have to
agree that it's pretty unfortunate that we don't handle this case in a
more sane way. I looked a bit into pull_up_sublinks() and it doesn't
look like there's an easy way to realize this case there without going
through the full effort of constant-folding.

One approach that I'm wondering about is to do constant folding first
and then track if we introduce a case where additional constant folding
might help and only perform it again in those cases.
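
To make the ordering issue concrete, here is a toy Python sketch. It is
not PostgreSQL code: the expression classes, the simplified
fold_constants() and pull_up_sublinks(), and the "changed" flag are all
assumptions made up for illustration, and the flag is only a coarse
stand-in for "we introduced something that further folding might help".

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Const:
    value: bool


@dataclass(frozen=True)
class Exists:
    subquery: str  # opaque stand-in for a correlated sublink


@dataclass(frozen=True)
class Or:
    args: Tuple[object, ...]


@dataclass(frozen=True)
class SemiJoin:
    subquery: str


def fold_constants(expr):
    """Toy folding: drop FALSE arms of an OR, collapse to TRUE if any arm is TRUE."""
    if not isinstance(expr, Or):
        return expr
    args = [fold_constants(a) for a in expr.args]
    if any(a == Const(True) for a in args):
        return Const(True)
    args = [a for a in args if a != Const(False)]
    if not args:
        return Const(False)
    return args[0] if len(args) == 1 else Or(tuple(args))


def pull_up_sublinks(expr):
    """Toy pull-up: only a top-level EXISTS becomes a semijoin.

    Returns (new_expr, changed). An EXISTS buried under an OR is left alone,
    which is the behaviour that bites query A. (AND-ed quals are omitted.)
    """
    if isinstance(expr, Exists):
        return SemiJoin(expr.subquery), True
    return expr, False


def plan_current_order(where):
    """Order described above: identify semijoins first, fold afterwards."""
    pulled, _ = pull_up_sublinks(where)
    return fold_constants(pulled)


def plan_fold_first(where):
    """Proposed order: fold, pull up, then fold again only if pull-up changed something."""
    folded = fold_constants(where)
    pulled, changed = pull_up_sublinks(folded)
    # In this toy the second pass never finds anything new; it only shows
    # where the conditional re-fold would sit.
    return fold_constants(pulled) if changed else pulled


query_a = Or((Const(False), Exists("bar")))  # WHERE false OR EXISTS (...)
query_b = Exists("bar")                      # WHERE EXISTS (...)

for name, q in (("A", query_a), ("B", query_b)):
    print(name, plan_current_order(q), plan_fold_first(q))
```

In this model, query A keeps a plain EXISTS under the current ordering
but becomes a semijoin when folding runs first, while query B is a
semijoin either way. Whether a cheap enough "might help" test exists in
the real planner is exactly the open question.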

Thanks!

Stephen
