Re: [PERFORM] Outer joins and equivalence

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [PERFORM] Outer joins and equivalence
Date: 2008-06-05 05:17:47
Message-ID: 1212643067.4148.244.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance


On Wed, 2008-06-04 at 22:18 -0400, Tom Lane wrote:
> [ redirecting thread from -performance to -hackers ]
>
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > I've got a test case which shows something related and weird, though not
> > the exact case.
>
> > The queries shown here have significantly different costs, depending
> > upon whether we use tables a or b in the query. Since a and b are
> > equivalent this result isn't expected at all.
>
> Hmm. I had been guessing that there was something about your original
> query that prevented the system from applying best_appendrel_indexscan,
> but after fooling with this a bit, I don't believe that's the issue
> at all. The problem is that these two cases "should" be equivalent:
>
> select ... from a join b on (a.id = b.id) left join c on (a.id = c.id);
>
> select ... from a join b on (a.id = b.id) left join c on (b.id = c.id);
>
> but they are not seen that way by the current planner. It correctly
> forms an EquivalenceClass consisting of a.id and b.id, but it cannot put
> c.id into that same class, and so the clause a.id = c.id is just left
> alone; there is noplace that can generate "b.id = c.id" as an
> alternative join condition. This means that (for the first query)
> we can consider the join orders "(a join b) leftjoin c" and
> "(a leftjoin c) join b", but there is no way to consider the join
> order "(b leftjoin c) join a"; to implement that we'd need to have the
> alternative join clause available. So if that join order is
> significantly better than the other two, we lose.
>
> This is going to take a bit of work to fix :-(. I am toying with the
> idea that we could go ahead and put c.id into the EquivalenceClass
> as a sort of second-class citizen that's labeled as associated with this
> particular outer join --- the implication being that we can execute the
> outer join using a generated clause that equates c.id to any one of the
> first-class members of the EquivalenceClass, but above the outer join
> we can't assume that c.id hasn't gone to null, so it's not really equal
> to anything else in the class. I think it might also be possible
> to get rid of the reconsider_outer_join_clauses() kluge in favor of
> driving transitive-equality-to-a-constant off of this representation.

Yes, EquivalenceClass allows an implication to be made either way
around, which is wrong for this class of problem. I was imagining a
higher level ImplicationClass that was only of the form A => B but not B
=> A. So we end up with an ImplicationTree, rather than a just a flat
Class. Which is where I punted...

> However there's a larger issue here, which is the very identity of an
> outer join :-(. Currently, for the first query above, the left join
> is defined as being between a and c, with a being the minimum
> left-hand-side needed to form the join. To be able to handle a case
> like this, it seems that the notion of a "minimum left hand side"
> falls apart altogether. We can execute the OJ using either a or b
> as left hand side. So the current representation of OuterJoinInfo,
> and the code that uses it to enforce valid join orders, needs a serious
> rethink.

Hadn't seen that at all.

> Looks like 8.4 development material to me, rather than something we
> can hope to back-patch a fix for...

Definitely.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2008-06-05 06:13:47 Re: orafce does NOT build with Sun Studio compiler
Previous Message Mayuresh Nirhali 2008-06-05 05:14:23 orafce does NOT build with Sun Studio compiler

Browse pgsql-performance by date

  From Date Subject
Next Message Heikki Linnakangas 2008-06-05 06:18:29 Re: insert/update tps slow with indices on table > 1M rows
Previous Message Tom Lane 2008-06-05 02:18:12 Re: [PERFORM] Outer joins and equivalence