Re: Parallel Aggregate

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Parallel Aggregate
Date: 2015-12-21 23:35:18
Message-ID: CAJrrPGcheLgAW0WaGXcQvXN=Hc9N4LWgMXMnc7r7c7akHbKD8g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 22, 2015 at 2:16 AM, Paul Ramsey <pramsey(at)cleverelephant(dot)ca> wrote:
> Shouldn’t parallel aggregate come into play regardless of scan selectivity?
> I know in PostGIS land there’s a lot of stuff like:
>
> SELECT ST_Union(geom) FROM t GROUP BY areacode;
>
> Basically, in the BI case, there’s often no filter at all. Hoping that’s
> considered a prime case for parallel agg :)

Yes, the latest patch attached in the thread addresses this issue.
But it still lacks of proper cost calculation and comparison with
original aggregate cost.

The parallel aggregate selects only when the number of groups
should be at least less than 1/4 of rows that are getting selected.
Otherwise, doing aggregation two times for more number of
records leads to performance drop compared to original aggregate.

Regards,
Hari Babu
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2015-12-21 23:38:15 Re: Parallel Aggregate
Previous Message David Rowley 2015-12-21 23:28:22 Re: Patch to improve a few appendStringInfo* calls