Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Олег Царев <zabivator(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)
Date: 2009-08-13 17:49:49
Message-ID: e08cc0400908131049s15495fc5y42e0d102a0ae77c2@mail.gmail.com
Lists: pgsql-hackers

2009/8/14 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> I preferred using a CTE, because it was the shortest way to a small,
> nearly bug-free prototype with full functionality.

You could do it by query rewriting, but as you say, the cleanest way
is a total refactoring of the existing nodeAgg. Ease of implementation
alone is not a convincing argument.

>> When we want GROUPING SETS (a, b), we first sort by a and aggregate,
>> then sort by b and aggregate. This is the same as:
>>
>> select a, null, count(*) from x group by a
>> union all
>> select null, b, count(*) from x group by b
>>
>> so there is nothing better than query rewriting unless we invent
>> something new.
>>
> The problem is when x is a subquery. Then it is better to use a CTE,
> because we don't need to evaluate x twice. In the most typical use
> case, x isn't a table.

So we need a single-scan aggregate as far as possible. Buffering the
subquery's result is possible without a CTE node. Tuplestore has that
functionality, but I found that the buffered result would then be
sorted multiple times. One way might be to allow tuplesort to perform
the sort multiple times with different keys.
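
To illustrate the idea only, here is a rough Python sketch of "scan once,
buffer, then sort per grouping key". It is not nodeAgg or tuplesort code.
The names scan_x and grouping_sets_count are hypothetical, the list stands
in for a tuplestore, and sorted() stands in for a tuplesort that can be
re-run with a different key.

```python
from itertools import groupby
from operator import itemgetter


def scan_x():
    """Hypothetical stand-in for an expensive subquery x; yields (a, b) rows."""
    scan_x.calls += 1
    yield from [(1, "p"), (1, "q"), (2, "p"), (2, "p"), (3, "q")]


scan_x.calls = 0  # counts how many times x is evaluated


def grouping_sets_count(source, key_columns, width):
    """Return count(*) per single-column grouping set, reading source once."""
    buffered = list(source())  # plays the role of the tuplestore
    result = []
    for col in key_columns:
        # Re-sort the same buffer on a different key, then aggregate runs
        # of equal keys, the way a sort-based aggregate would.
        ordered = sorted(buffered, key=itemgetter(col))
        for value, rows in groupby(ordered, key=itemgetter(col)):
            out = [None] * width  # non-grouped columns come back as NULL
            out[col] = value
            result.append((*out, sum(1 for _ in rows)))
    return result


# Equivalent in spirit to GROUPING SETS (a, b) with count(*)
rows = grouping_sets_count(scan_x, key_columns=[0, 1], width=2)
```

The rows come out in the same shape as the UNION ALL query quoted above,
but x is evaluated only once. The cost that remains is one sort of the
buffer per grouping set, which is the part a multi-key tuplesort would
have to address.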

Regards,

--
Hitoshi Harada
