Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Олег Царев <zabivator(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)
Date: 2009-08-13 16:07:58
Message-ID: e08cc0400908130907ya1d1902p77c533f3a6b1066f@mail.gmail.com
Lists: pgsql-hackers

2009/8/8 Alvaro Herrera <alvherre(at)commandprompt(dot)com>:
> Олег Царев wrote:
>> Hello all!
>> If no one objects (in other words, if all agree), I will continue work
>> on the patch. In particular, I want to support a second strategy (a
>> tuple store instead of a hash table) to preserve the order of the
>> source (a cheaper solution for grouping sets + ORDER BY), investigate
>> and brainstorm other optimisations, and write regression tests and
>> technical documentation. But I need some time to complete my
>> investigation of the internals of PostgreSQL, particularly CTEs.
>
> Where are we on this patch?  Is it moving forward?
>

It seems to me that the patch goes backward.

I looked through gsets-0.6.diff for about an hour, and found that it is
now only syntactic sugar that builds multiple GROUP BY queries on top
of the CTE functionality. There is no executor modification.

If I remember correctly, the original patch touched the executor.
I'd buy GROUPING SETS if it touches the executor, but not if it is
only syntactic sugar, because you can write the same query yourself
without the GROUPING SETS syntax. The motivation for pushing this
forward is, I guess, performance that cannot be achieved by rewriting
the query.

Because the GROUP BY we have today is by definition a subset of
GROUPING SETS, I suppose we'll refactor nodeAgg.c so that it can take
multiple group definitions. And we must support both HashAgg and
GroupAgg. For HashAgg it is easier in any case, as the earlier patch
shows. For GroupAgg it is a bit more complicated, since we sort by
different key sets.
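
To illustrate the HashAgg case, here is a minimal sketch in Python (not
executor code, and not taken from any version of the patch). It assumes
one hash table per group definition, all filled during the same scan.
The name hash_grouping_sets and the row format are hypothetical, and
only count(*) is computed.

```python
from collections import defaultdict


def hash_grouping_sets(rows, grouping_sets):
    """Compute count(*) for several grouping sets in a single scan.

    rows: iterable of dicts mapping column name to value.
    grouping_sets: list of tuples of column names, e.g. [("a",), ("b",)].
    Hypothetical helper for illustration; this is not PostgreSQL code.
    """
    # One hash table per grouping set, all updated while scanning once.
    tables = [defaultdict(int) for _ in grouping_sets]
    for row in rows:
        for table, cols in zip(tables, grouping_sets):
            key = tuple(row[c] for c in cols)
            table[key] += 1

    # Output columns: every grouping column, in order of first appearance.
    all_cols = []
    for cols in grouping_sets:
        for c in cols:
            if c not in all_cols:
                all_cols.append(c)

    # Columns that are not part of the current set are emitted as None (NULL).
    result = []
    for table, cols in zip(tables, grouping_sets):
        for key, count in table.items():
            values = dict(zip(cols, key))
            result.append(tuple(values.get(c) for c in all_cols) + (count,))
    return result


x = [{"a": 1, "b": "p"}, {"a": 1, "b": "q"}, {"a": 2, "b": "p"}]
hash_result = hash_grouping_sets(x, [("a",), ("b",)])
```

The point of the sketch is that the input is read once however many
group definitions there are, at the cost of holding one hash table per
definition in memory.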

When we want GROUPING SETS (a, b), we first sort by a and aggregate,
then sort by b and aggregate. This is the same as:

select a, null, count(*) from x group by a
union all
select null, b, count(*) from x group by b

so it is no better than query rewriting unless we invent something new.

But for subtotals and a grand total, as in a ROLLUP query, GroupAgg
can do it in a single scan by keeping multiple PerGroup states with
different life cycles.
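
A rough sketch of that idea, again in Python and only as an
illustration: the input is assumed to be already sorted by the ROLLUP
columns, and one running count is kept per rollup level, each reset
when its own group ends. The name rollup_count and the row format are
hypothetical, and only count(*) is handled.

```python
def rollup_count(sorted_rows, cols):
    """count(*) for ROLLUP(cols) in one pass over rows sorted by cols.

    Level k groups by the prefix cols[:k]; level 0 is the grand total.
    Hypothetical helper for illustration; this is not PostgreSQL code.
    """
    n = len(cols)
    counts = [0] * (n + 1)  # counts[k]: running count for prefix cols[:k]
    prev = None             # grouping key of the previous row
    out = []

    def flush(level, key):
        # Emit the finished group at `level`; rolled-up columns become None.
        out.append(key[:level] + (None,) * (n - level) + (counts[level],))
        counts[level] = 0

    for row in sorted_rows:
        key = tuple(row[c] for c in cols)
        if prev is not None and key != prev:
            # Leftmost column position whose value changed.
            changed = next(i for i in range(n) if key[i] != prev[i])
            # Every level deeper than `changed` has finished its group.
            for level in range(n, changed, -1):
                flush(level, prev)
        for level in range(n + 1):
            counts[level] += 1
        prev = key

    # End of input closes every open group, the grand total last.
    # (In this sketch an empty input produces no rows at all.)
    if prev is not None:
        for level in range(n, -1, -1):
            flush(level, prev)
    return out


rows = [{"a": 1, "b": 1}, {"a": 1, "b": 2}, {"a": 2, "b": 1}]
rollup_result = rollup_count(rows, ("a", "b"))
```

Each level's state lives exactly as long as its own group: the (a, b)
state is reset whenever b changes, the (a) state only when a changes,
and the grand-total state once at the end of the scan.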

Anyway, before going ahead we need a rough sketch of how to implement
this feature. Is syntactic sugar alone acceptable? Or is internal
executor support necessary?

Regards,

--
Hitoshi Harada
