Re: Design notes for EquivalenceClasses

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Zeugswetter Andreas ADI SD" <ZeugswetterA(at)spardat(dot)at>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Design notes for EquivalenceClasses
Date: 2007-01-19 16:55:06
Message-ID: 15237.1169225706@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Zeugswetter Andreas ADI SD" <ZeugswetterA(at)spardat(dot)at> writes:
> Tom Lane writes:
>> SELECT *
>> FROM a LEFT JOIN
>> (SELECT * FROM b JOIN c ON b.y = c.z WHERE b.y = 10) ss
>> ON a.x = ss.y
>> WHERE a.x = 42;
>>
>> ... In this example, notice also that
>> a.x = ss.y (really a.x = b.y) is not an equivalence clause because its
>> applicability to b is restricted by the outer join; thus we do not make
>> the mistake of concluding b.y = 42, even though we do have an equivalence
>> class for {a.x 42}.

> I am not sure I understand the logic behind the above restriction
> though. Although b.y cannot be in the EquivalenceClass as described,
> it still seems important/possible to push down b.y = 42 into ss.

Hmmm ... yeah, you're right, this example needs revision because we
actually do create {b.y 42} as a "below outer join" equivalence.
In fact with patch I get a plan like

Nested Loop Left Join (cost=76.05..139.42 rows=1331 width=12)
-> Seq Scan on a (cost=0.00..36.75 rows=11 width=4)
Filter: (x = 42)
-> Materialize (cost=76.05..77.26 rows=121 width=8)
-> Result (cost=36.76..75.93 rows=121 width=8)
One-Time Filter: (42 = 10)
-> Nested Loop (cost=36.76..75.93 rows=121 width=8)
-> Seq Scan on b (cost=0.00..36.75 rows=11 width=4)
Filter: (y = 10)
-> Materialize (cost=36.76..36.87 rows=11 width=4)
-> Seq Scan on c (cost=0.00..36.75 rows=11 width=4)
Filter: (z = 10)

which'll cause it to not evaluate the b/c join at all, as you suggested.
8.2 also realizes that b.y=42 is required, but it's a lot stupider about
what to do with the knowledge:

Hash Left Join (cost=81.79..118.59 rows=11 width=12)
Hash Cond: (a.x = b.y)
-> Seq Scan on a (cost=0.00..36.75 rows=11 width=4)
Filter: (x = 42)
-> Hash (cost=81.65..81.65 rows=11 width=8)
-> Hash Join (cost=42.11..81.65 rows=11 width=8)
Hash Cond: (c.z = b.y)
-> Seq Scan on c (cost=0.00..31.40 rows=2140 width=4)
-> Hash (cost=42.10..42.10 rows=1 width=4)
-> Seq Scan on b (cost=0.00..42.10 rows=1 width=4)
Filter: ((y = 10) AND (y = 42))

Notice 8.2 also fails to derive c.z=10.

> It seems what we want in addition to EquivalenceClasses, is logic to
> push (or rather copy) down a restriction but keep the upperlevel part
> of it for outer joins.

No, the bit that I was missing when I wrote that sentence was the
concept of a "below outer join" EquivalenceClass that allows values
to go to null.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2007-01-19 17:02:34 Re: Fix for bug in plpython bool type conversion
Previous Message Alvaro Herrera 2007-01-19 15:51:27 Re: Windows buildfarm failures