Re: [HACKERS] Gsets: ROW expression semantic broken between 9.4 and 9.5

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: [HACKERS] Gsets: ROW expression semantic broken between 9.4 and 9.5
Date: 2015-07-26 15:06:30
Message-ID: 87mvyjxbi6.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

>>>>> "Jeevan" == Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com> writes:

Jeevan> Hi
Jeevan> It looks like we have broken the ROW expression without
Jeevan> explicit ROW keyword in GROUP BY.

Andres has given the short version, but here's the long version:

In the spec, GROUP BY ROW(a,b) is an error, while GROUP BY (a,b) is
exactly equivalent to GROUP BY a,b.

Previously, pg treated GROUP BY (a,b) as if it were GROUP BY ROW(a,b)
since it was parsing it as an expression, and (a,b) in an expression is
shorthand for ROW(a,b). However, the parens are significant in many
contexts in the grouping set syntax, e.g. ROLLUP(a,(b,c)) is equivalent
to GROUPING SETS ((a,b,c), (a), ()), and we have to be able to parse
both GROUPING SETS (a,b) (which is two grouping sets) and GROUPING SETS
((a,b),(c,d)), which means that we can't actually use the grammar to
distinguish expressions from parenthesized sublists. What the code
therefore does is to explicitly distinguish (a,b) and ROW(a,b), and
treat the first as a list and the second as a single expression.

This is documented in the following NOTE in queries.sgml:

<note>
<para>
The construct <literal>(a,b)</> is normally recognized in expressions as
a <link linkend="sql-syntax-row-constructors">row constructor</link>.
Within the <literal>GROUP BY</> clause, this does not apply at the top
levels of expressions, and <literal>(a,b)</> is parsed as a list of
expressions as described above. If for some reason you <emphasis>need</>
a row constructor in a grouping expression, use <literal>ROW(a,b)</>.
</para>
</note>

http://www.postgresql.org/docs/9.5/static/queries-table-expressions.html#QUERIES-GROUPING-SETS

Andres has suggested that this should be mentioned in an incompatibility
note in the release notes. I'm not sure that's needed, since I don't
believe there are any cases where previously valid queries change in
behavior; a query such as

select (a,b) from (values (1,2)) v(a,b) group by (a,b);

previously evaluated the row constructor before grouping, while now it
groups by a and b separately and evaluates the row constructor
afterwards. If there's a way to make this change affect the result,
I've not found it yet, even when using volatile functions.

--
Andrew (irc:RhodiumToad)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-07-26 15:13:25 Re: [HACKERS] Grouping Sets: Fix unrecognized node type bug
Previous Message Fabien COELHO 2015-07-26 15:01:50 Re: checkpointer continuous flushing

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2015-07-26 15:13:25 Re: [HACKERS] Grouping Sets: Fix unrecognized node type bug
Previous Message Andres Freund 2015-07-26 14:13:41 Re: [HACKERS] Grouping Sets: Fix unrecognized node type bug