AW: A fine point about OUTER JOIN semantics

From: Zeugswetter Andreas SB <ZeugswetterA(at)wien(dot)spardat(dot)at>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: AW: A fine point about OUTER JOIN semantics
Date: 2000-09-05 14:56:59
Message-ID: 11C1E6749A55D411A9670001FA687963368065@sdexcsrv1.f000.d0188.sd.spardat.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I can tell you the results Informix produces:

> Am I right in thinking that the WHERE clause of a query must logically
> be applied *after* any joins specified in the FROM clause?
>
> For example, suppose that we have table t1 (x int) containing the
> values 1, 2, 3, 4, and table t2 (y int) containing the values 1, 2, 4.
> It's clear that the result of
> SELECT * FROM t1 LEFT JOIN t2 ON (x = y);
> should be
> x y
>
> 1 1
> 2 2
> 3 NULL
> 4 4

same

>
> But suppose we make the query
> SELECT * FROM t1 LEFT JOIN t2 ON (x = y) WHERE y <> 2;
> It seems to me this should yield
> x y
>
> 1 1
> 3 NULL
> 4 4
>
> and not
> x y
>
> 1 1
> 2 NULL
> 3 NULL
> 4 4

x y

1 1
4 4

>
> which is what you'd get if the y=2 tuple were filtered out before
> reaching the left-join stage. Does anyone read the spec differently,
> or get the latter result from another implementation?
>
> The reason this is interesting is that this example breaks a rather
> fundamental assumption in our planner/optimizer, namely that WHERE
> conditions can be pushed down to the lowest level at which all the
> variables they mention are available. Thus the planner would normally
> apply "y <> 2" during its bottom-level scan of t2, which
> would cause the
> LEFT JOIN to decide that x = 2 is an unmatched value, and thus produce
> a "2 NULL" output row.
>
> An even more interesting example is
> SELECT * FROM t1 FULL JOIN t2 ON (x = y AND y <> 2);
> My interpretation is that this should produce
> x y
>
> 1 1
> 2 NULL
> NULL 2
> 3 NULL
> 4 4

x y

1 1
4 4

> since both t1's x=2 and t2's y=2 tuple will appear "unmatched".
> This is *not* the same output you'd get from
> SELECT * FROM t1 FULL JOIN t2 ON (x = y) WHERE y <> 2;
> which I think should yield
> x y
>
> 1 1
> 3 NULL
> 4 4
> This shows that JOIN/ON conditions for outer joins are not
> semantically
> interchangeable with WHERE conditions.

x y

1 1
4 4

>
> This is going to be a bit of work to fix, so I thought I'd better
> confirm that I'm reading the spec correctly before I dive into it.

No idea if they interpret correctly, but seems they hand it interchangeably.
Someone want to check Oracle and MS Sql ?

Andreas

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2000-09-05 15:30:15 Re: AW: A fine point about OUTER JOIN semantics
Previous Message Leandro Fanzone 2000-09-05 14:41:23 C++ library probs