Re: Allowing NOT IN to use ANTI joins

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing NOT IN to use ANTI joins
Date: 2014-07-15 19:55:58
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> I've made some changes to the patch so that it only allows the conversion
> to ANTI JOIN to take place if both the outer query's expressions AND the
> subquery's target list can be proved not to have NULLs.

This coding doesn't fill me with warm fuzzy feelings.
query_outputs_are_not_nullable, as originally constituted, knew that it
was attempting to prove the query's tlist non-nullable; that's the reason
for the setop restriction, and it also justifies looking at all the
available quals. If you're trying to make a similar proof for expressions
occurring in a random qual clause, I don't think you can safely look at
quals coming from higher syntactic nesting levels. And on the other side
of the coin, outer joins occurring above the syntactic level of the NOT IN
aren't reason to dismiss using an antijoin, because they don't null
variables appearing in it.

It might be possible to fix that by passing in the jointree node at which
the NOT IN is to be evaluated, and doing the find_innerjoined_rels search
for the outer-query exprs from there rather than always from the jointree
root. I've not thought carefully about this though.

> I found that the call to is_NOTANY_compatible_with_antijoin adds about 0.2%
> and 2.3% to total planning time. Though the 2.3% was quite an extreme case,
> and the 0.2% was the most simple case I could think of.

Hm. Since, as you say, the cost is 0 unless there's a NOT IN, that seems
to indicate that we can afford this test ... as long as it does something
often enough to be useful. I'm still a bit doubtful about that. However,
it does give us the option of telling people that they can fix their
queries by adding "WHERE x IS NOT NULL", so maybe that's helpful enough
even if it doesn't fix real-world queries right out of the gate.

Since we're at the end of the June commitfest, I'm going to mark this
patch Returned With Feedback in the commitfest list.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2014-07-15 19:57:38 Re: [COMMITTERS] pgsql: Reset master xmin when hot_standby_feedback disabled.
Previous Message Simon Riggs 2014-07-15 19:43:15 Re: [COMMITTERS] pgsql: Reset master xmin when hot_standby_feedback disabled.