Re: Fixing Grittner's planner issues

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Subject: Re: Fixing Grittner's planner issues
Date: 2009-02-06 04:34:06
Message-ID: 603c8f070902052034hb2ea783ic75061d9aef9db30@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I think the only fix for this that's practical in the 8.4 time frame is
> to give up trying to flatten EXISTS/NOT EXISTS that occur within the ON
> condition of an outer join, ie, revert to 8.3's level of intelligence
> about this case. It seems like a general purpose solution would involve
> somehow being able to separate the semantic effects of an outer join
> (in particular, null-insertion) from the time at which it's performed,
> and I've really got no idea how to do that or what consequences it would
> have for both the planner and the executor.

I think that A LEFT JOIN (B ANTI JOIN C) is equivalent to (A LEFT JOIN
B) LEFT JOIN (UNIQUE C) if you rewrite each attribute provided by b as
CASE WHEN (no matching tuple found in C) THEN b.attribute END.

On a related note, I have some vague unease about planning A SEMI JOIN
B as A INNER JOIN (UNIQUE B), as make_one_rel currently attempts to
do. For a merge join or nested loop, I don't see how this can ever be
a win over teaching the executor to just not rescan B. For a hash
join, it can be a win if B turns out to have duplicates, but then
again you could also just teach the executor to skip the insertion of
the duplicate into the table in the first place (it has to hash 'em
anyway...). I think maybe I'm not understanding something about the
logic here.

It seems like you might want some infrastructure for cases where we
know we are just probing the inner side of the relation for a match.
If you find at least one, you either make a decision as to whether or
not to filter the tuple (regular semi/anti-join) or whether or not to
force certain fields to NULL (semi/anti-join under ON-clause). But
all these cases can share the
we-don't-care-about-multiple-inner-matches logic.

> Reflecting on this further, I suspect there are also some bugs in the
> planner's rules about when semi/antijoins can commute with other joins;
> but that's not directly causing Kevin's problem, because the rules do
> make the correct decision for this case.

One thing I notice is that src/backend/optimizer/README should
probably be updated with the rules for commuting SEMI and ANTI joins;
it currently only mentions INNER, LEFT, RIGHT, and FULL.

...Robert

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message K, Niranjan (NSN - IN/Bangalore) 2009-02-06 05:14:48 Re: Synch Replication
Previous Message Koichi Suzuki 2009-02-06 03:16:43 Re: 8.4 release planning