Re: NOT IN subquery optimization

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jim Finnerty <jfinnert(at)amazon(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOT IN subquery optimization
Date: 2019-02-21 02:11:06
Message-ID: 8224.1550715066@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jim Finnerty <jfinnert(at)amazon(dot)com> writes:
> re: The idea that's been kicked around in the past is to detect whether the
> subselect's output column(s) can be proved NOT NULL, and if so, convert
> to an antijoin just like NOT EXISTS

> basically, yes. this will handle nullability of both the outer and inner
> correlated expression(s), multiple expressions, presence or absence of
> predicates in the WHERE clause, and whether the correlated expressions are
> on the null-padded side of an outer join. If it is judged to be more
> efficient, then it transforms the NOT IN sublink into an anti-join.

Hmm, that seems overcomplicated ...

> some complications enter into the decision to transform NOT IN to anti-join
> based on whether a bitmap plan will/not be used, or whether it will/not be
> eligible for PQ.

... and that even more so, considering that this decision really needs
to be taken long before cost estimates would be available.

As far as I can see, there should be no situation where we'd not want
to transform to antijoin if we can prove it's semantically valid to
do so. If there are cases where that comes out as a worse plan,
that indicates a costing error that would be something to address
separately (because it'd also be a problem for other antijoin cases).
Also, as long as it nearly always wins, I'm not going to cry too hard
if there are corner cases where it makes the wrong choice. That's not
something that's possible to avoid completely.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2019-02-21 02:17:45 Re: insensitive collations
Previous Message David Rowley 2019-02-21 02:05:14 Re: Performance issue in foreign-key-aware join estimation