Re: OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

From: "Robert Bedell" <robert(at)friendlygenius(dot)com>
To: "'Hannu Krosing'" <hannu(at)tm(dot)ee>
Cc: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets
Date: 2003-12-17 23:55:59
Message-ID: 200312171856370.SM00984@xavier
Lists: pgsql-hackers

> I guess that by adding hash aggregates Tom solved most problems of
> adding ROLLUP, CUBE and GROUPING SETS.
>
> OTOH, I'm not sure if hash aggregates can already spill to disk if not
> enough memory is available for keeping them all. If not, then adding
> this capability would be a great push towards their general use for
> GROUPING SETS.
>
> Also, a mix of scan-over-sorted-group-by + hash aggregates for
> out-of-order extra groups would be a great way to use less memory for
> hash aggregates.

The other issue is that in a scan-over-sorted-group-by with no out-of-order
grouping sets, you can return tuples as you reset the aggregators. With
out-of-order grouping sets you would have to wait until the whole table had
been scanned, at least for those grouping sets, before returning the
resulting tuples. Since that accumulated state could get rather large, the
spill-to-disk functionality is necessary. It should probably mimic how the
sort does it...
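
To make the difference concrete, here is a rough sketch in Python (purely
illustrative, not PostgreSQL code; grouped_counts and its arguments are
hypothetical names). It assumes the input rows are already sorted by
sort_cols, uses COUNT(*) as the only aggregate, and keeps the hash tables
entirely in memory, so it does not show the spill-to-disk part at all.
Grouping sets that are a prefix of the sort order are emitted as soon as
their key changes; the others are only emitted after the scan ends.

```python
from collections import defaultdict

def grouped_counts(rows, sort_cols, grouping_sets):
    """Yield (grouping_set, key, count) for rows pre-sorted by sort_cols."""
    sort_cols = tuple(sort_cols)
    grouping_sets = [tuple(gs) for gs in grouping_sets]
    # A set can be streamed only if it is a prefix of the sort order.
    in_order = [gs for gs in grouping_sets if gs == sort_cols[:len(gs)]]
    out_of_order = [gs for gs in grouping_sets if gs not in in_order]

    current = {}  # streamed sets: grouping set -> [current key, count]
    hashed = {gs: defaultdict(int) for gs in out_of_order}

    for row in rows:
        for gs in in_order:
            key = tuple(row[c] for c in gs)
            state = current.get(gs)
            if state is not None and state[0] != key:
                # Key changed: return the finished group, reset the aggregator.
                yield gs, state[0], state[1]
                state = None
            if state is None:
                state = current[gs] = [key, 0]
            state[1] += 1
        for gs in out_of_order:
            # Must be held until the whole input has been scanned.
            hashed[gs][tuple(row[c] for c in gs)] += 1

    for gs in in_order:
        if gs in current:
            yield gs, current[gs][0], current[gs][1]
    for gs in out_of_order:
        for key, count in hashed[gs].items():
            yield gs, key, count
```

With rows sorted by (a, b), the set (a) streams out during the scan while
the set (b) sits in its hash table until the end, which is the state that
would need to spill.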

Another point is selecting the best way to sort a given collection of
grouping sets for minimal memory usage. Any ORDER BY in the query should
really be applied after the grouping operation.
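
As a toy illustration of that choice (a Python sketch, not planner code;
best_sort_order is a hypothetical name), one could brute-force the column
orderings and keep the one under which the most grouping sets form a prefix
of the sort order, since those can be streamed rather than hashed. Counting
streamed sets is only a crude stand-in for memory usage, which is my
assumption here; a real cost model would presumably weigh the estimated
number of groups in each set.

```python
from itertools import permutations

def best_sort_order(grouping_sets):
    """Return (order, hits): the column order that lets the most sets stream."""
    sets = [frozenset(gs) for gs in grouping_sets]
    cols = sorted(set().union(*sets)) if sets else []
    best, best_hits = tuple(cols), -1
    for order in permutations(cols):
        # A set streams if its columns are exactly the first len(set) sort columns.
        hits = sum(1 for s in sets if s == frozenset(order[:len(s)]))
        if hits > best_hits:
            best, best_hits = order, hits
    return best, best_hits
```

For the sets (a, b), (a) and (b, c), this picks the order (a, b, c), under
which two of the three sets stream and only (b, c) is left out of order. The
search is factorial in the number of columns, so it is only meant to show
the idea.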

The CUBE and ROLLUP operators should really be applied by expanding them
into the equivalent collections of grouping sets.
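
A minimal sketch of that expansion, in Python for illustration (the function
names are hypothetical, and columns are just strings here): ROLLUP yields
every prefix of its column list, and CUBE yields every subset.

```python
from itertools import combinations

def expand_rollup(cols):
    """ROLLUP(a, b, c) -> (a, b, c), (a, b), (a), ()"""
    cols = tuple(cols)
    return [cols[:n] for n in range(len(cols), -1, -1)]

def expand_cube(cols):
    """CUBE(a, b) -> (a, b), (a), (b), (): every subset, largest first."""
    cols = tuple(cols)
    sets = []
    for n in range(len(cols), -1, -1):
        sets.extend(combinations(cols, n))
    return sets
```

So a ROLLUP over n columns becomes n + 1 grouping sets and a CUBE becomes
2^n, and the rest of the executor only ever has to deal with plain
grouping sets.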

Cheers,

Robert
