Re: [HACKERS] Oops, I seem to have changed UNION's behavior

From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Oops, I seem to have changed UNION's behavior
Date: 1999-05-09 13:14:12
Message-ID: 199905091314.JAA21695@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Was there a conclusion on this?

> The equal() updates I installed yesterday (to fix the "don't know
> whether nodes of type 600 are equal" problem) have had an unintended
> side effect.
>
> Am I right in thinking that UNION (without ALL) is defined to do a
> DISTINCT on its result, so that duplicates are removed even if the
> duplicates both came from the same source table? That's what 6.4.2
> does, but I do not know if it's strictly kosher according to the SQL
> spec.
>
> If so, the code is now busted, because with the equal() extension in
> place, cnfify() is able to recognize and remove duplicate select
> clauses. That is, "SELECT xxx UNION SELECT xxx" will be folded to
> just "SELECT xxx" ... and that doesn't mean the same thing.
>
> An actual example: given the data
>
> play=> select a from tt;
> a
> -
> 1
> 1
> 2
> 3
> (4 rows)
>
> Under 6.4.2 I get:
>
> play=> select a from tt union select a from tt;
> a
> -
> 1
> 2
> 3
> (3 rows)
>
> Note lack of duplicate "1". Under current sources I get:
>
> ttest=> select a from tt union select a from tt;
> a
> -
> 1
> 1
> 2
> 3
> (4 rows)
>
> since the query is effectively reduced to just "select a from tt".
>
> Assuming that 6.4.2 is doing the Right Thing, I see two possible fixes:
> (1) simplify equal() to say that two T_Query nodes are never equal, or
> (2) modify the planner so that the "select distinct" operation is
> inserted explicitly, and will thus happen even if the UNIONed selects
> are collapsed into just one.
>
> (1) is a trivial fix of course, but it worries me --- maybe someday
> we will need equal() to give an honest answer for Query nodes.
> But I don't have the expertise to apply (2), and it seems like rather
> a lot of work for a boundary case that isn't really interesting in
> practice.
>
> Comments? *Is* 6.4.2 behaving according to the SQL spec?
>
> regards, tom lane
>
>

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Vadim Mikheev 1999-05-09 13:56:40 Re: [HACKERS] Re: [COMMITTERS] 'pgsql/src/backend/parser gram.c'
Previous Message Bruce Momjian 1999-05-09 13:02:51 Re: [HACKERS] char(n) default '' crashes server