Re: possible optimization: push down aggregates

From:	"Tomas Vondra" <tv(at)fuzzy(dot)cz>
To:	"Merlin Moncure" <mmoncure(at)gmail(dot)com>
Cc:	"Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: possible optimization: push down aggregates
Date:	2014-08-27 21:25:47
Message-ID:	e8063776a8278cba845957f9a95132b9.squirrel@sq.gransy.com
Lists:	pgsql-hackers

On 27 August 2014, 21:41, Merlin Moncure wrote:
> On Wed, Aug 27, 2014 at 2:07 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
>>
>> Are there some plans to use partitioning for aggregation?
>
> Besides min/max, what other aggregates (mean/stddev come to mind)
> would you optimize and how would you determine which ones could be?
> Where is that decision made?
>
> For example, could user defined aggregates be pushed down if you had a
> reaggregation routine broken out from the main one?

I think that what Pavel suggests is that when you are aggregating by

GROUP BY x

and 'x' happens to be the partitioning key (making it impossible for
groups from different partitions to overlap), then it's perfectly fine to
perform the aggregation per partition and just append the results.
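
A minimal sketch of that equivalence, written in Python rather than as
planner code. It assumes rows are (x, value) pairs, that the aggregate is
a plain SUM, and that each partition holds a disjoint set of x values.
The names hash_aggregate and partitions are made up for illustration and
do not correspond to anything in PostgreSQL.

```python
from collections import defaultdict
from itertools import chain


def hash_aggregate(rows):
    """Sum values per key, roughly what SUM(v) ... GROUP BY x computes."""
    groups = defaultdict(int)
    for x, v in rows:
        groups[x] += v
    return dict(groups)


# Hypothetical data: partitioned on x, so no key appears in two partitions.
partitions = [
    [(1, 10), (2, 5), (1, 7)],
    [(3, 4), (4, 1), (3, 6)],
]

# "Append first, then aggregate": one hash table over all rows.
append_then_agg = hash_aggregate(chain.from_iterable(partitions))

# "Aggregate per partition, then append": the per-partition results are
# simply concatenated, with no re-aggregation step, because the key sets
# are disjoint.
agg_then_append = {}
for part in partitions:
    agg_then_append.update(hash_aggregate(part))

assert append_then_agg == agg_then_append
```

Note that this only works without a combine step because of the disjoint
keys; if a key could appear in two partitions, the update() call would
overwrite one partial sum with another.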

If you need sorted output, you can sort the appended results afterwards
(assuming the cardinality of the output is much lower than that of the
input data).

This "append first, then aggregate" approach may be the reason the planner
switches to a sort-based plan (for fear that the number of groups will
exceed work_mem), while we could just as well process each partition
separately with a hash aggregate.
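
To illustrate the memory argument, here is a rough Python sketch that only
counts distinct groups; it does not model work_mem or the planner's cost
estimates. The figure of 8 partitions with 1000 distinct keys each is an
invented example, as is the helper name count_groups.

```python
def count_groups(rows):
    """Number of distinct group keys, i.e. entries a hash table would hold."""
    return len({x for x, _ in rows})


# Hypothetical layout: 8 partitions, each with its own range of 1000 keys.
partitions = [
    [(key, 1) for key in range(p * 1000, (p + 1) * 1000)]
    for p in range(8)
]

# Aggregating over the appended input needs one table for every group.
all_rows = [row for part in partitions for row in part]
groups_append_first = count_groups(all_rows)

# Aggregating partition by partition needs only one partition's groups
# at a time; each finished table can be emitted and discarded.
groups_per_partition_peak = max(count_groups(part) for part in partitions)

print(groups_append_first, groups_per_partition_peak)
```

With this layout the peak hash table is one eighth the size of the single
global one, which is the case where a per-partition hash aggregate could
stay in memory even though a global one would not.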

Tomas