Re: Parallel Aggregate

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Paul Ramsey <pramsey(at)cleverelephant(dot)ca>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Aggregate
Date: 2016-03-17 03:10:58
Message-ID: CAKJS1f-tpdR0qDi-gw2SN+TW2vbbO6M0W8A4NoZhcYxF+A0XDQ@mail.gmail.com
Lists: pgsql-hackers

On 17 March 2016 at 03:47, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> More review comments:
>
> /*
> + * Likewise for any partial paths, although this case is more simple
> + * as we don't track the cheapest path.
> + */
>
> I think in the process of getting rebased over the rapidly-evolving
> underlying substructure, this comment no longer makes much sense where
> it is in the file. IIUC, the comment is referring back to "Forcibly
> apply that target to all the Paths for the scan/join rel", but there's
> now enough other stuff in the middle that it doesn't really make sense
> any more. And actually, I think you should move the code up higher,
> not change the comment. This belongs before setting
> root->upper_targets[foo].

Yes, that's not where it was originally... contents may settle during transit...

> The logic in create_grouping_paths() is too ad-hoc and, as Amit and I
> have both complained about, wrong in detail because it doesn't call
> has_parallel_hazard anywhere. Basically, you have the wrong design.
> There shouldn't be any need to check parallelModeOK here. Rather,
> what you should be doing is setting consider_parallel to true or false
> on the upper rel. See set_rel_consider_parallel for how this is set
> for base relations, set_append_rel_size() for append relations, and
> perhaps most illustratively build_join_rel() for join relations. You
> should have some equivalent of this logic for upper rels, or at least
> the upper rels you care about:
>
> if (inner_rel->consider_parallel && outer_rel->consider_parallel &&
> !has_parallel_hazard((Node *) restrictlist, false))
> joinrel->consider_parallel = true;
>
> Then, you can consider parallel aggregation if consider_parallel is
> true and any other conditions that you care about are also met.
>
> I think that the way you are considering sorted aggregation, hashed
> aggregation, and mixed strategies does not make very much sense. It
> seems to me that what you should do is:
>
> 1. Figure out the cheapest PartialAggregate path. You will need to
> compare the costs of (a) a hash aggregate, (b) an explicit sort +
> group aggregate, and (c) grouping a presorted path. (We can
> technically skip (c) for right now since it can't occur.) I would go
> ahead and use add_partial_path() on these to stuff them into the
> partial_pathlist for the upper rel.
>
> 2. Take the first (cheapest) path in the partial_pathlist and stick a
> Gather node on top of it. Store this in a local variable, say,
> partial_aggregate_path.
>
> 3. Construct a finalize-hash-aggregate path for partial_aggregate_path
> and also a sort+finalize-group/plain-aggregate path for
> partial_aggregate_path, and add each of those to the upper rel. They
> will either beat out the non-parallel paths or they won't.
>
> The point is that the decision as to whether to use hashing or sorting
> below the Gather is completely independent from the choice of which
> one to use above the Gather. Pick the best strategy below the Gather;
> then pick the best strategy to stick on top of that above the Gather.

Good point. I've made local alterations to the patch so that partial
paths are now generated on the grouped_rel.
I also got rid of the enable_hashagg test in create_grouping_paths().
The use of this seemed rather old-school planner style. Instead I
altered cost_agg() to add disable_cost appropriately, which simplifies
the logic of when and when not to consider hash aggregate paths. The
only downside of this that I can see is that the hash agg Path is
still generated when enable_hashagg is off, and that means a tiny bit
more work creating the path and calling add_path() for it.

This change also allows nodeGroup to be used for parallel aggregate
when there are no aggregate functions. This was a bit broken in the
old version.

> * This is very similar to make_group_input_target(), only we do not recurse
> * into Aggrefs. Aggrefs are left intact and added to the target list. Here we
> * also add any Aggrefs which are located in the HAVING clause into the
> * PathTarget.
> *
> * Aggrefs are also setup into partial mode and the partial return types are
> * set to become the type of the aggregate transition state rather than the
> * aggregate function's return type.
>
> This comment isn't very helpful because it tells you what the function
> does, which you can find out anyway from reading the code. What it
> should do is explain why it does it. Just to take one particular
> point, why would we not want to recurse into Aggrefs in this case?

Good point. I've updated it to:

/*
* make_partialgroup_input_target
* Generate appropriate PathTarget for input for Partial Aggregate nodes.
*
* Similar to make_group_input_target(), only we don't recurse into Aggrefs, as
* we need these to remain intact so that they can be found later in Combine
* Aggregate nodes during setrefs. Vars will still be pulled out of
* non-Aggref nodes as these will still be required by the combine aggregate
* phase.
*
* We also convert any Aggrefs which we do find and put them into partial mode;
* this adjusts the Aggref's return type so that the partially calculated
* aggregate value can make its way up the execution tree up to the Finalize
* Aggregate node.
*/

I will post an updated patch once I've addressed Amit's points.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
