OUTER JOIN problem

From: Zoltan Boszormenyi <zboszor(at)freemail(dot)hu>
To: pgsql-sql(at)postgresql(dot)org
Subject: OUTER JOIN problem
Date: 2004-06-23 19:44:15
Message-ID: 40D9DD8F.50306@freemail.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hi,

I have a problem with LEFT OUTER JOIN, not only in PostgreSQL
but in Informix, too. I thought someone can explain the "bug"
I am seeing. Let me qoute my psql session, forgive me, it's a
bit long:

$ LANG=C psql -h localhost -U postgres postgres
Welcome to psql 7.3.4-RH, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit

postgres=# create table a (i serial, t varchar(40));
JELZÉS: CREATE TABLE will create implicit sequence 'a_i_seq' for SERIAL
column 'a.i'
CREATE TABLE
postgres=# create table b (i serial, t varchar(40));
JELZÉS: CREATE TABLE will create implicit sequence 'b_i_seq' for SERIAL
column 'b.i'
CREATE TABLE
postgres=# create table c (i serial, a integer, b integer);
JELZÉS: CREATE TABLE will create implicit sequence 'c_i_seq' for SERIAL
column 'c.i'
CREATE TABLE

... Insert some records into all three tables ...
(Actually table 'b' is not used in the SELECTs,
table 'c' would store connections between 'a' and 'b'
in the application's broader context.)

postgres=# select * from a;
i | t
---+---
1 | 1
2 | 2
3 | 3
4 | 4
(4 rows)

postgres=# select * from b;
i | t
---+---
1 | 5
2 | 6
3 | 7
4 | 8
(4 rows)

postgres=# select * from c;
i | a | b
---+---+---
1 | 1 | 1
2 | 1 | 2
3 | 1 | 4
4 | 2 | 3
5 | 3 | 1
6 | 3 | 2
7 | 4 | 4
(7 rows)

postgres=# select * from a left outer join c on (a.i=c.a) where c.b is
null or c.b=1;
i | t | i | a | b
---+---+---+---+---
1 | 1 | 1 | 1 | 1
3 | 3 | 5 | 3 | 1
(2 rows)

Let's try something:

postgres=# delete from c;
DELETE 7
postgres=# insert into c (a,b) values (1,1);
INSERT 18490 1
postgres=# select * from a left outer join c on (a.i=c.a) where c.b is
null or c.b=1;
i | t | i | a | b
---+---+---+---+---
1 | 1 | 8 | 1 | 1
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
(4 rows)

postgres=# insert into c (a,b) values (1,3);
INSERT 18491 1
postgres=# select * from a left outer join c on (a.i=c.a) where c.b is
null or c.b=1;
i | t | i | a | b
---+---+---+---+---
1 | 1 | 8 | 1 | 1
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
(4 rows)

postgres=# insert into c (a,b) values (3,1);
INSERT 18492 1
postgres=# select * from a left outer join c on (a.i=c.a) where c.b is
null or c.b=1;
i | t | i | a | b
---+---+----+---+---
1 | 1 | 8 | 1 | 1
2 | 2 | | |
3 | 3 | 10 | 3 | 1
4 | 4 | | |
(4 rows)

Now I get the results I want. Let's insert some more data:

postgres=# insert into c (a,b) values (2,3);
INSERT 18494 1
postgres=# insert into c (a,b) values (3,1);
INSERT 18495 1
postgres=# insert into c (a,b) values (4,4);
INSERT 18496 1
postgres=# select * from a left outer join c on (a.i=c.a) where c.b is
null or c.b=1;
i | t | i | a | b
---+---+----+---+---
1 | 1 | 8 | 1 | 1
3 | 3 | 10 | 3 | 1
3 | 3 | 13 | 3 | 1
(3 rows)

Again I don't get the data I want. I accidentally inserted duplicated
data, (a,b)=(3,1) twice. (Working without unique indexes may backfire.)
The original dataset at the beginning of my example did not contain
duplicated data.

I don't know how PostgreSQL works internally but this bug *must* be
conforming to some standard if two distinct SQL server products behave
(almost) the same. I said almost, I discovered the same annoyance today
on an Informix 9.21 running under SCO UNIX on a slightly larger dataset
with less than 70 rows. It just left out some arbitrary rows that had
NULLs from the right side table (i.e not existing rows).

The following (not exactly SQL conform) construct works in Informix 9.21
and always gives me all the rows I wanted and no more:

select * from a, outer b where a.i=c.a and (c.b is null or c.b=1);

I know I can substitute OUTER JOIN with a UNION of an INNER JOIN and
a SELECT from the left table WHERE NOT EXISTS (SELECT from the right-
side table WHERE condition). But that's the point of the OUTER JOIN,
isn't it? Now can someone tell me whether it is a real bug in BOTH SQL
servers? Or is it a conforming behaviour to some part of the SQL
standard? Then please, point me where to RTFM?

I am not on the list, please Cc: me.

Thanks in advance,
Zoltán Böszörményi

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message sad 2004-06-24 04:56:50 Re: feature request ?
Previous Message Josh Berkus 2004-06-23 17:12:36 Re: feature request ?