From: | Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> |
---|---|
To: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Final Patch for GROUPING SETS - unrecognized node type: 347 |
Date: | 2014-09-07 16:52:21 |
Message-ID: | 87egvn8k0e.fsf@news-spur.riddles.org.uk |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
>>>>> "Tomas" == Tomas Vondra <tv(at)fuzzy(dot)cz> writes:
>> As for computing it all twice, there's currently no attempt to
>> optimize multiple identical grouping sets into multiple
>> projections of a single grouping set result. CUBE(a,b,c,a) has
>> twice as many grouping sets as CUBE(a,b,c) does, even though all
>> the extra ones are duplicates.
Tomas> Shouldn't this be solved by eliminating the excessive
Tomas> ChainAggregate? Although it probably changes GROUPING(...),
Tomas> so it's not just about removing the duplicate column(s) from
Tomas> the CUBE.
Eliminating the excess ChainAggregate would not change the number of
grouping sets, only where they are computed.
Tomas> Maybe preventing this completely (i.e. raising an ERROR with
Tomas> "duplicate columns in CUBE/ROLLUP/... clauses") would be
Tomas> appropriate. Does the standard says anything about this?
The spec does not say anything explicitly about duplicates, so they
are allowed (and duplicate grouping _sets_ can't be removed, only
duplicate columns within a single GROUP BY clause after the grouping
sets have been eliminated by transformation). I have checked my
reading of the spec against oracle 11 and MSSQL using sqlfiddle.
The way the spec handles grouping sets is to define a sequence of
syntactic transforms that result in a query which is a UNION ALL of
ordinary GROUP BY queries. (We haven't tried to implement the
additional optional feature of GROUP BY DISTINCT.) Since it's UNION
ALL, any duplicates must be preserved, so a query with GROUPING SETS
((a),(a)) reduces to:
SELECT ... GROUP BY a UNION ALL SELECT ... GROUP BY a;
and therefore has duplicates of all its result rows.
I'm quite prepared to concede that I may have read the spec wrong
(wouldn't be the first time), but in this case I require any such
claim to be backed up by an example from some other db showing an
actual difference in behavior.
--
Andrew (irc:RhodiumToad)
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2014-09-07 17:06:04 | Re: Adding a nullable DOMAIN column w/ CHECK |
Previous Message | Emre Hasegeli | 2014-09-07 16:09:39 | Re: Selectivity estimation for inet operators |