Re: 7.2. Table Expressions FULL join is only supported with merge-joinable join conditions

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Grzegorz Szpetkowski" <gszpetkowski(at)gmail(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: <pgsql-docs(at)postgresql(dot)org>
Subject: Re: 7.2. Table Expressions FULL join is only supported with merge-joinable join conditions
Date: 2011-06-13 21:47:28
Message-ID: 4DF63F20020000250003E56D@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Grzegorz Szpetkowski <gszpetkowski(at)gmail(dot)com> wrote:

>> "The join condition specified with ON can also contain conditions
>> that do not relate directly to the join."

I think the trouble starts with that sentence, which I believe to be
completely false and misleading.

Simplifying a real-life instance of such confusion among our
programmers:

SELECT <Party columns>, Demographic.dob
FROM Party LEFT JOIN Demographic
ON (<Party.pkcols = Demographic.pkcols>
AND Demographic.dod is NULL);

Which makes absence of date of death part of the outer join
criteria. So you get all the parties, dead or alive; and only show
date of birth for those not known to be dead. What they really
wanted to do was exclude parties known to be dead, and for those
parties listed, show date of birth if available. So they wanted:

SELECT <Party columns>, Demographic.dob
FROM Party LEFT JOIN Demographic
ON (<Party.pkcols = Demographic.pkcols>)
WHERE Demographic.dod is NULL;

Conditions in the ON clause *do* relate to the JOIN -- it's just
that the join might be on conditions other than primary key
equality.

Let's not contribute to muddy thinking by making incorrect
statements like that.

> I don't have a clear feeling for exactly what is needed.

I think the thing which is most likely to surprise people is that
the result can contain rows which are not in the Cartesian product
of joining the two relations. We might want to point that out,
mention that it's an OUTER JOIN in *both* directions, and maybe give
an example which is half-way plausible as a use-case. Maybe
something similar to:

test=# create table n_en (n int, word text);
CREATE TABLE
test=# create table n_de (n int, wort text);
CREATE TABLE
test=# insert into n_en values (1,'one'),(2,'two');
INSERT 0 2
test=# insert into n_de values (2, 'zwei'),(3,'drei');
INSERT 0 2
test=# select * from n_en full join n_de using (n);
n | word | wort
---+------+------
1 | one |
2 | two | zwei
3 | | drei
(3 rows)

And that works to show the difference between:

test=# select * from n_en full join n_de
test-# on (n_en.n = n_de.n and n_de.n > 2);
n | word | n | wort
---+------+---+------
1 | one | |
2 | two | |
| | 2 | zwei
| | 3 | drei
(4 rows)

and:

test=# select * from n_en full join n_de
test-# on (n_en.n = n_de.n) where n_de.n > 2;
n | word | n | wort
---+------+---+------
| | 3 | drei
(1 row)

-Kevin

In response to

Browse pgsql-docs by date

  From Date Subject
Next Message Ross J. Reedstrom 2011-06-13 22:11:32 Re: Documentation and explanatory diagrams
Previous Message Alvaro Herrera 2011-06-13 21:26:39 Re: Improve warnings around CREATE INDEX CONCURRENTLY