On Mon, Sep 27, 2010 at 1:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I've been looking at the bug reported by Kirill Simonov here:
> This is pretty ugly :-(. What we have is
> select * from A
> left join (select some-expressions from B left join C ...)
> The expressions are not guaranteed nullable; that is, they won't
> automatically produce nulls from all-null rows for B and C.
> This means that we have to evaluate the expressions at the join of
> B and C, producing PlaceHolderVars, which then bubble up through
> the left join with A, and will get replaced with nulls in any
> null-extended row of that join. Then references to them in the final
> select list will correctly show as nulls.
> The problem is that there isn't anything guaranteeing that the B/C join
> will be formed before joining to A. In Kirill's example the planner
> thinks it should commute the two joins, and then it evaluates the
> PlaceHolderVars at the now-upper B/C join, meaning they fail to go to
> nulls in rows where they should.
> This is a fundamental oversight in the original design of the
> PlaceHolderVar mechanism. I'm not sure how come I didn't see that some
> sort of join-order interlock was necessary, but I didn't.
> The most trustworthy fix that I can think of goes like this:
> 1. Add an extra pass over the parsetree early in query_planner() to
> locate PlaceHolderVars, set up the PlaceHolderInfo list, and remember the
> syntactic level of each actually-used PlaceHolderVar. It's slightly
> annoying to have to make an extra tree traversal for this, but at least
> we can skip it in the common case where no PlaceHolderVars were created.
> (Currently, this processing happens as a side-effect during
> distribute_qual_to_rels, but that's too late for what we need to do in
> point 2. Also, making the PlaceHolderInfos immediately when the
> PlaceHolderVars are created would be unpalatable since we don't know at
> that stage whether the PlaceHolderVars are actually used above the
> relevant outer join.)
> 2. In make_outerjoininfo, scan the PlaceHolderInfo list to see if there
> are any placeholders that should be evaluated below the current outer join
> and are needed above it. If so, enlarge the outer join's min_righthand
> set to ensure that it won't get moved below where the placeholder has to
> be evaluated. We can fold the existing fix_placeholder_eval_levels()
> logic into this operation, which will make that code less fragile than it
> is now.
> This is a larger change than I would prefer to back-patch, but the only
> less-invasive alternative I can see is to lobotomize the PlaceHolderVar
> mechanism entirely by reverting to 8.3-style logic wherein we prevented
> pullup of sub-selects that would require introduction of placeholders.
> That would undo a significant optimization feature of 8.4, one that
> I believe we're now relying on for reasonable performance of some system
> Thoughts, better ideas?
Personally, I would rather back-patch a more invasive bug fix than a
The Enterprise Postgres Company
In response to
pgsql-hackers by date
|Next:||From: Robert Haas||Date: 2010-09-27 17:33:05|
|Subject: Re: client socket TIME_WAIT state after PQfinish|
|Previous:||From: Tom Lane||Date: 2010-09-27 17:21:39|
|Subject: Re: Improving prep_buildtree used in VPATH builds |