Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)

From: Joshua Tolley <eggyknap(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Олег Царев <zabivator(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)
Date: 2009-05-13 03:09:09
Message-ID: 20090513030909.GG8468@eddie
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, May 12, 2009 at 11:20:14PM +0200, Pavel Stehule wrote:
> this patch has some bugs but it is good prototype (it's more stable
> than old patch):

I'm not sure if you're at the point that you're interested in bug reports, but
here's something that didn't behave as expected:

5432 josh(at)josh*# create table gsettest (prod_id integer, cust_id integer,
quantity integer);
CREATE TABLE
5432 josh(at)josh*# insert into gsettest select floor(random() * 10)::int,
floor(random() * 20)::int, floor(random() * 10)::int from generate_series(1,
100);
INSERT 0 100
5432 josh(at)josh*# select prod_id, cust_id, sum(quantity) from gsettest group by
cube (prod_id, cust_id) order by 1, 2;
prod_id | cust_id | sum
---------+---------+-----
5 | 7 | 4
8 | 16 | 3
9 | 19 | 8
4 | 13 | 3
8 | 8 | 15
5 | 2 | 4
7 | 6 | 7
6 | 6 | 3
</snip>

Note that the results aren't sorted. The following, though, works around it:

5432 josh(at)josh*# select * from (select prod_id, cust_id, sum(quantity) from
gsettest group by cube (prod_id, cust_id)) f order by 1, 2;
prod_id | cust_id | sum
---------+---------+-----
0 | 2 | 8
0 | 4 | 8
0 | 5 | 2
0 | 7 | 11
0 | 8 | 7
0 | 9 | 1
0 | 12 | 3
0 | 14 | 7
0 | 16 | 5
0 | 17 | 8
0 | 18 | 9
0 | 19 | 2
0 | | 71
</snip>

EXPLAIN output is as follows:
5432 josh(at)josh*# explain select prod_id, cust_id, sum(quantity) from gsettest
group by cube (prod_id, cust_id) order by 1, 2;
QUERY PLAN
---------------------------------------------------------------------------
Append (cost=193.54..347.71 rows=601 width=9)
CTE **g**
-> Sort (cost=135.34..140.19 rows=1940 width=12)
Sort Key: gsettest.prod_id, gsettest.cust_id
-> Seq Scan on gsettest (cost=0.00..29.40 rows=1940 width=12)
-> HashAggregate (cost=53.35..55.85 rows=200 width=12)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=12)
-> HashAggregate (cost=48.50..51.00 rows=200 width=8)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=8)
-> HashAggregate (cost=48.50..51.00 rows=200 width=8)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=8)
-> Aggregate (cost=43.65..43.66 rows=1 width=4)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=4)
(13 rows)

...and without the ORDER BY clause just to prove that it really is the reason
for the Sort step...

5432 josh(at)josh*# explain select prod_id, cust_id, sum(quantity) from gsettest
group by cube (prod_id, cust_id);
QUERY PLAN
------------------------------------------------------------------------
Append (cost=82.75..236.92 rows=601 width=9)
CTE **g**
-> Seq Scan on gsettest (cost=0.00..29.40 rows=1940 width=12)
-> HashAggregate (cost=53.35..55.85 rows=200 width=12)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=12)
-> HashAggregate (cost=48.50..51.00 rows=200 width=8)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=8)
-> HashAggregate (cost=48.50..51.00 rows=200 width=8)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=8)
-> Aggregate (cost=43.65..43.66 rows=1 width=4)
-> CTE Scan on "**g**" (cost=0.00..38.80 rows=1940 width=4)
(11 rows)

I'm hoping I'll get a chance to poke at the patch some. This could be very
useful...

- Josh / eggyknap

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2009-05-13 04:29:41 Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)
Previous Message Bruce Momjian 2009-05-13 02:01:53 Re: idea: global temp tables