Re: Speeding up aggregates

From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Speeding up aggregates
Date: 2002-12-06 21:55:44
Message-ID: 1039211743.2069.25.camel@rh72.home.ee
Lists: pgsql-performance

Tom Lane wrote on Sat, 07.12.2002 at 02:42:
> Hannu Krosing <hannu(at)tm(dot)ee> writes:
> > This should also make it easier to implement all kinds of GROUP BY
> > ROLLUP|CUBE|GROUPING SETS|() queries.
>
> > Do you have any near-term plans for doing them ?
>
> Not me.

I'll try to look into it then.

No promises about when it will be ready ;)

> > Is there a variable to set that would disable one or another, like we
> > currently have for disabling various join strategies ?
>
> enable_hashagg. I didn't think about one to prevent the old style.
>
> >> Note that even though there's no SORT, the sort_mem setting is used
> >> to determine the allowable hashtable size, so a too-small sort_mem
> >> might discourage the planner from selecting hashed aggregation.
>
> > Do you mean that hashed aggregation can't overflow to disk, or would it
> > just be too slow ?
>
> I didn't write any code to let it overflow to disk --- didn't seem
> likely to be useful. (You're probably better off with a sort-based
> aggregation if there are too many distinct grouping keys.)

For a simple GROUP BY that is most likely so, but for CUBE or GROUPING SETS
it may still be faster to overflow to disk than to make N passes over the
data in different orderings.
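To illustrate the point (a toy Python sketch, not PostgreSQL internals; the
columns and data are invented): a hashed implementation can fill one hash
table per grouping set in a single pass over the input, where a sort-based
plan would need one sorted pass per set.

```python
from collections import defaultdict

# Hypothetical input rows: (region, product, amount)
rows = [
    ("east", "widget", 10),
    ("east", "gadget", 20),
    ("west", "widget", 30),
    ("west", "widget", 40),
]

# GROUPING SETS ((region), (product)): one hash table per grouping set,
# all filled during a single scan of the data -- no sorting required.
grouping_sets = [("region",), ("product",)]
col_index = {"region": 0, "product": 1}

tables = {gs: defaultdict(int) for gs in grouping_sets}
for row in rows:
    for gs in grouping_sets:
        key = tuple(row[col_index[c]] for c in gs)
        tables[gs][key] += row[2]   # SUM(amount)

print(dict(tables[("region",)]))   # {('east',): 30, ('west',): 70}
print(dict(tables[("product",)]))  # {('widget',): 80, ('gadget',): 20}
```

A sort-based plan would instead re-sort (or re-scan pre-sorted data) once per
grouping set, which is where the N passes come from.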

Of course, we could use a combined approach here: do it the old way (sort)
for the main body, and run a parallel hashed aggregation for the other,
out-of-order groups.
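That combined idea could be sketched roughly like this (purely illustrative
Python, assuming input that is mostly sorted on the grouping key; the data
is made up): groups are streamed in sorted order, and the few rows that
arrive out of order fall back to a small hash table that is merged in at
the end.

```python
from collections import defaultdict

# Mostly-sorted input on the grouping key; one row ("a", 9) is out of order.
rows = [("a", 1), ("a", 2), ("b", 3), ("a", 9), ("b", 4), ("c", 5)]

results = {}
hash_side = defaultdict(int)   # out-of-order groups land here
current_key, current_sum = None, 0

for key, val in rows:
    if key == current_key:
        current_sum += val                     # still inside the current group
    elif current_key is None or key > current_key:
        # key advanced: the previous group is complete, emit it (sort side)
        if current_key is not None:
            results[current_key] = results.get(current_key, 0) + current_sum
        current_key, current_sum = key, val
    else:
        # key went backwards: row is out of order, aggregate it in the hash table
        hash_side[key] += val

if current_key is not None:
    results[current_key] = results.get(current_key, 0) + current_sum

# merge the hashed out-of-order groups into the streamed results
for key, val in hash_side.items():
    results[key] = results.get(key, 0) + val

print(results)   # {'a': 12, 'b': 7, 'c': 5}
```

The hash table only has to hold the out-of-order groups, so it stays small
as long as the input is close to sorted.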

------------
Hannu
