Re: Parallel Aggregate

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Paul Ramsey <pramsey(at)cleverelephant(dot)ca>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Aggregate
Date: 2015-12-21 04:23:55
Message-ID: CAJrrPGe-GJsJYrimoDJHUOha58-VMFG92txdaPB-5YUBtwAKTQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Dec 19, 2015 at 5:39 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Dec 16, 2015 at 5:59 AM, David Rowley
> <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>> One thing I noticed is that you're only enabling Parallel aggregation when
>> there's already a Gather node in the plan. Perhaps this is fine for a proof
>> of concept, but I'm wondering how we can move forward from this to something
>> that can be committed.
>
> As far as that goes, I think the infrastructure introduced by the
> parallel join patch will be quite helpful here. That introduces the
> concept of a "partial path" - that is, a path that needs a Gather node
> in order to be completed. And that's exactly what you need here:
> after join planning, if there's a partial path available for the final
> rel, then you can consider
> FinalizeAggregate->Gather->PartialAggregate->[the best partial path].
> Of course, whether a partial path is available or not, you can
> consider Aggregate->[the best regular old path].

Thanks for the details.
The patch now generates the partial aggregate plan on top of the partial path
list when one is available. The supporting code changes were taken from the
parallel join patch for reference.

Instead of always generating the parallel aggregate plan on top of the partial
path list whenever it exists, how about comparing the costs of the normal
aggregate and the parallel aggregate and choosing the cheaper one?
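
To make the cost-comparison idea concrete, here is a minimal sketch in Python.
It is not the planner's cost model and not code from the attached patches: the
constants, the formula, and the names (Path, choose_aggregate) are hypothetical
and chosen only to show how a fixed parallel startup cost can make the serial
plan cheaper on small inputs.

```python
from dataclasses import dataclass

# Hypothetical cost constants, for illustration only.
PER_ROW_COST = 0.01        # cost of aggregating one input row
PARALLEL_SETUP = 1000.0    # fixed cost of launching workers
PER_TUPLE_TRANSFER = 0.1   # cost of passing one partial result through Gather


@dataclass(frozen=True)
class Path:
    description: str
    total_cost: float


def choose_aggregate(rows: int, workers: int, groups: int = 1) -> Path:
    """Return the cheaper of the serial and the parallel aggregate path."""
    serial = Path("Aggregate -> best regular path", rows * PER_ROW_COST)
    if workers < 1:
        # No partial path is available, so only the serial plan is considered.
        return serial
    parallel_cost = (
        PARALLEL_SETUP
        + rows * PER_ROW_COST / workers            # each worker scans a share
        + workers * groups * PER_TUPLE_TRANSFER    # one partial state per group per worker
    )
    parallel = Path(
        "FinalizeAggregate -> Gather -> PartialAggregate -> best partial path",
        parallel_cost,
    )
    # Keep whichever path is cheaper; ties go to the simpler serial plan.
    return serial if serial.total_cost <= parallel.total_cost else parallel
```

With these made-up numbers, a small input stays serial because the setup cost
dominates, while a large input switches to the parallel plan.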

The parallel aggregate patch is now separated from the combine aggregate patch.
The latest combine aggregate patch is also attached to this mail for reference,
as the parallel aggregate patch depends on it.
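
As a rough illustration of why that dependency exists, the sketch below (in
Python, not taken from either patch) shows an average computed in three stages:
each worker builds a partial state, the partial states are combined, and a
final step turns the combined state into the result. The function names are
hypothetical and do not correspond to functions in the patches.

```python
from functools import reduce
from typing import Iterable, Optional, Tuple

State = Tuple[float, int]  # (running sum, row count)


def partial_avg(values: Iterable[float]) -> State:
    """What one worker computes over its share of the rows."""
    total, count = 0.0, 0
    for v in values:
        total += v
        count += 1
    return (total, count)


def combine(a: State, b: State) -> State:
    """Merge two partial states; this is the step the combine patch enables."""
    return (a[0] + b[0], a[1] + b[1])


def finalize(state: State) -> Optional[float]:
    """Turn the combined state into the final average (None for no rows)."""
    total, count = state
    return total / count if count else None


def parallel_avg(chunks: Iterable[Iterable[float]]) -> Optional[float]:
    # Each chunk stands in for the rows seen by one worker.
    states = [partial_avg(chunk) for chunk in chunks]
    return finalize(reduce(combine, states, (0.0, 0)))
```

The result matches a plain average over all rows regardless of how the rows are
split among workers, which is the property a parallel plan relies on.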

I have attached the latest performance report. Parallel aggregate has some
overhead in the low-selectivity case. This can be avoided with the help of a
cost comparison between the normal and parallel aggregates.

Regards,
Hari Babu
Fujitsu Australia

Attachment                                            Content-Type                                                       Size
performance_test_result_21_12_2015.xlsx               application/vnd.openxmlformats-officedocument.spreadsheetml.sheet  12.4 KB
combine_aggregate_state_789a9af_2015-12-18 (1).patch  application/octet-stream                                           73.4 KB
parallelagg_poc_v3.patch                              application/octet-stream                                           61.4 KB
