Re: Parallel Aggregate

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Paul Ramsey <pramsey(at)cleverelephant(dot)ca>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Aggregate
Date: 2016-03-17 12:22:15
Message-ID: CAA4eK1JOs02Z8+KuuXUECdye2Keae7ofZHHdT=o6RMTEi58Smw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 17, 2016 at 10:35 AM, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
wrote:
>
> On 17 March 2016 at 01:19, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > Few assorted comments:
> >
> > 2.
> > AggPath *
> > create_agg_path(PlannerInfo *root,
> > @@ -2397,9 +2399,11 @@ create_agg_path(PlannerInfo *root,
> >
> > List *groupClause,
> > List *qual,
> > const AggClauseCosts
> > *aggcosts,
> > - double numGroups)
> > + double numGroups,
> > +
> > bool combineStates,
> > + bool finalizeAggs)
> >
> > Don't you need to set parallel_aware flag in this function as we do for
> > create_seqscan_path()?
>
> I don't really know the answer to that... I mean there's nothing
> special done in nodeAgg.c if the node is running in a worker or in the
> main process.
>

On again thinking about it, I think it is okay to set parallel_aware flag
as false. This flag means whether that particular node has any parallelism
behaviour which is true for seqscan, but I think not for partial aggregate
node.

Few other comments on latest patch:

1.

+ /*
+ * XXX does this need estimated for each partial path, or are they
+ * all
going to be the same anyway?
+ */
+ dNumPartialGroups = get_number_of_groups(root,
+
clamp_row_est(partial_aggregate_path->rows),
+
rollup_lists,
+
rollup_groupclauses);

For considering partial groups, do we need to rollup related lists?

2.
+ hashaggtablesize = estimate_hashagg_tablesize(partial_aggregate_path,
+
&agg_costs,
+
dNumPartialGroups);
+
+ /*
+ * Generate a
hashagg Path, if we can, but we'll skip this if the hash
+ * table looks like it'll exceed work_mem.
+
*/
+ if (can_hash && hashaggtablesize < work_mem * 1024L)

hash table size should be estimated only if can_hash is true.

3.
+ foreach(lc, grouped_rel->partial_pathlist)
+ {
+ Path *path =
(Path *) lfirst(lc);
+ double total_groups;
+
+ total_groups = path-
>parallel_degree * path->rows;
+
+ path = (Path *) create_gather_path(root, grouped_rel, path,
NULL,
+ &total_groups);

Do you need to perform it foreach partial path or just do it for
firstpartial path?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2016-03-17 12:36:36 Re: Proposal: Generic WAL logical messages
Previous Message David Rowley 2016-03-17 12:22:07 Re: Parallel Aggregate