Oops, I seem to have changed UNION's behavior

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Oops, I seem to have changed UNION's behavior
Date: 1999-02-07 18:46:42
Message-ID: 5172.918413202@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The equal() updates I installed yesterday (to fix the "don't know
whether nodes of type 600 are equal" problem) have had an unintended
side effect.

Am I right in thinking that UNION (without ALL) is defined to do a
DISTINCT on its result, so that duplicates are removed even if the
duplicates both came from the same source table? That's what 6.4.2
does, but I do not know if it's strictly kosher according to the SQL
spec.

If so, the code is now busted, because with the equal() extension in
place, cnfify() is able to recognize and remove duplicate select
clauses. That is, "SELECT xxx UNION SELECT xxx" will be folded to
just "SELECT xxx" ... and that doesn't mean the same thing.

An actual example: given the data

play=> select a from tt;
a
-
1
1
2
3
(4 rows)

Under 6.4.2 I get:

play=> select a from tt union select a from tt;
a
-
1
2
3
(3 rows)

Note lack of duplicate "1". Under current sources I get:

ttest=> select a from tt union select a from tt;
a
-
1
1
2
3
(4 rows)

since the query is effectively reduced to just "select a from tt".

Assuming that 6.4.2 is doing the Right Thing, I see two possible fixes:
(1) simplify equal() to say that two T_Query nodes are never equal, or
(2) modify the planner so that the "select distinct" operation is
inserted explicitly, and will thus happen even if the UNIONed selects
are collapsed into just one.

(1) is a trivial fix of course, but it worries me --- maybe someday
we will need equal() to give an honest answer for Query nodes.
But I don't have the expertise to apply (2), and it seems like rather
a lot of work for a boundary case that isn't really interesting in
practice.

Comments? *Is* 6.4.2 behaving according to the SQL spec?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Meskes 1999-02-07 19:04:36 Re: [HACKERS] DEC OSF1 Compilation problems
Previous Message Stupor Genius 1999-02-07 18:39:59 RE: [HACKERS] Problems with >2GB tables on Linux 2.0