Re: Optimization rules for semi and anti joins

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Optimization rules for semi and anti joins
Date: 2009-02-12 00:33:46
Message-ID: 23867.1234398826@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> A6. (A antijoin B on (Pab)) leftjoin C on (Pbc)
>> = A antijoin (B leftjoin C on (Pbc)) on (Pab)

> How do you get the first form as a starting point?

Not sure if you can in SQL, but the point of the identity is you can
apply the transformation in either direction. Consider this version
of the second form:

select ... from A
where not exists(select 1 from B left join C on B.y = C.y where B.x = A.x)

The identity says that if B.y = C.y is strict we can antijoin A to B
first (because, in fact, the join to C is pointless here).

Anyway, whether the identity is really useful for antijoins isn't what
I'm concerned about --- I was just trying to see if it was okay for the
planner's join ordering logic to treat left and anti joins the same.
Seems it's not :-(

> (A semijoin B on (Pab)) antijoin C on (Pbc)
> = A semijoin (B antijoin C on (Pbc)) on (Pab)

> I think this one is true, and it doesn't seem to be mentioned, unless
> I'm missing something. It seems potentially useful.

Hmm, it doesn't seem terribly well-defined --- the values of B are
indeterminate above the semijoin in the first case, so having Pbc refer
to them doesn't seem like a good idea. In particular, it seems like in
the first case the semijoin could randomly choose a B row that has a
join partner in C, causing the A row to disappear from the result, when
the same A row has another B partner that does not join to C --- and the
second form would find that B partner and allow the A row to be output.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message ITAGAKI Takahiro 2009-02-12 00:57:25 Re: temporarily stop autovacuum
Previous Message Kevin Grittner 2009-02-12 00:13:53 Re: Optimization rules for semi and anti joins