Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Олег Царев <zabivator(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)
Date: 2009-05-10 13:21:36
Message-ID: 162867790905100621t3d4e2c94qfa7bfb895260981f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

some other info is on http://wiki.postgresql.org/wiki/Grouping_Sets

Maybe in e few weak I'll have some a prototype based on CTE. My older
technique based on hashed tables should be well, but it carry lot of
unshared code. Using CTE means so we can optimize CTE and GROUPING
SETS together. I plan to transform query

SELECT * FROM some GROUP BY GROUPING SETS(a, b)

to

WITH q AS (SELECT * FROM some)
SELECT * FROM q GROUP BY a
UNION ALL
SELECT * FROM q GROUP BY b

2009/5/10 Олег Царев <zabivator(at)gmail(dot)com>:
> Hello all.
> Please, approve my ideas for implementation.
>
> Standart has feature T431: Extended grouping capabilities.
> This feature i found in TODO-list:
> http://wiki.postgresql.org/wiki/Todo -> SQL Commands -> TO DO
>
> MS SQL 2005 partial support this feature:
> http://www.kodyaz.com/articles/sql-server-2005-cube-rollup-cannot-compute-distinct-aggregates.aspx
> http://blogs.msdn.com/craigfr/archive/2007/09/21/aggregation-with-rollup.aspx
>
> MS SQL 2008 support this feature:
> http://blogs.msdn.com/craigfr/archive/2007/10/11/grouping-sets-in-sql-server-2008.aspx
>
> Oracle support this feature:
> http://www.compshack.com/sql/oracle-group-rollup
>
> So, it's short notes about GROUPING SETS, but more complete
> information have in a official documentation of MS SQL and Oracle
> (copyright limited for send as attach).
>
> First. GROUPG SETS.
>
> select A,B,C,SUM(D) from table group by GROUPING SETS( (A,B,C), (A),
> () ) - it's example of use grouping sets.
> Semantic of this construction - make group by over source more, than
> one group of column.
> It's very wide key - A,B C. In result set of this example we can find
> result set of select   select A,B,C,SUM(D) from table group by A,B,C -
> as subset. It's mind: "GROUP BY A,B,C" - subset of "GROUP BY GROUPING
> SETS( (A,B,C), (A), () )
> Two subset - is GROUP BY A B, and instead C column we look NULL.
> Third subset - GROUP BY (), instead A,B,C - NULL, one row - it's name
> "GRAND TOTAL". - calculate over all subset without grouping
>
> Also have function "GROUPING"  it's function say about null - "real
> null" (from table) or generated by "GROUP BY GROUPING SETS"
>
> My point: this feature can implement over GROUP BY and UNION ALL
> We can make decomposition of "GROUP BY GROUPING SETS( (A,B,C),(A),()
> )" to  select A,B,C fron table GROUP BY A,B,C .UNION  ALL select
> A,B,NULL from table group BY A,B UNION ALL NUll,NUll,NULL from table
> group by();
>
> So, it's very simple, don't require modification of executor and
> callibrate cost - only parser and semantic anylysis,
> '
> So, ROLLUP(A1,...An) is alias to "GROUP BY GROUPING SETS(
> (A1,...,An),(A1,...,An-1),... (An-1,An),(An),() ),
> CUBE - analogue.
>
> If this idea it's good -  i can write code base on old patch
> http://archives.postgresql.org/pgsql-hackers/2008-10/msg00838.php or
> from clean list (as you wish).

This is suboptimal. When SELECT * FROM X GROUP BY GROUPING SETS ...,
where X is some joined data, then you repeat JOIN on every grouping
set. So your solution is simple for implementation, but it should be
really slow.

Regards
Pavel Stehule

>
> In future i know how to implement ROLLUP more optimal (executor
> iterator) and use this ROLLUP for optimisation another GROUP BY,
> GROUPING SETS.
>
> Thanks.
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2009-05-10 13:29:38 Re: SQL state in log_line_prefix
Previous Message Олег Царев 2009-05-10 12:57:06 Implementation of GROUPING SETS (T431: Extended grouping capabilities)