Re: [HACKERS] Partition-wise aggregation/grouping

From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Partition-wise aggregation/grouping
Date: 2018-02-01 13:59:49
Message-ID: CAM2+6=UpcC0cmmQhKrZaP7d+skRg21z706epMWJbcER_NvLK4g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Thu, Feb 1, 2018 at 1:11 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Mon, Jan 29, 2018 at 3:42 AM, Jeevan Chalke
> <jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> > Attached new patch set and rebased it on latest HEAD.
>
> I strongly dislike add_single_path_to_append_rel. It adds branches
> and complexity to code that is already very complex. Most
> importantly, why are we adding paths to fields in
> OtherUpperPathExtraData *extra instead of adding them to the path list
> of some RelOptInfo? If we had an appropriate RelOptInfo to which we
> could add the paths, then we could make this simpler.
>
> If I understand correctly, the reason you're doing it this way is
> because we have no place to put partially-aggregated, non-partial
> paths. If we only needed to worry about the parallel query case, we
> could just build an append of partially-aggregated paths for each
> child and stick it into the grouped rel's partial pathlist, just as we
> already do for regular parallel aggregation. There's no reason why
> add_paths_to_grouping_rel() needs to care about the difference a
> Partial Aggregate on top of whatever and an Append each branch of
> which is a Partial Aggregate on top of whatever. However, this won't
> work for non-partial paths, because add_paths_to_grouping_rel() needs
> to put paths into the grouped rel's pathlist -- and we can't mix
> together partially-aggregated paths and fully-aggregated paths in the
> same path list.
>

Yes.

>
> But, really, the way we're using grouped_rel->partial_pathlist right
> now is an awful hack. What I'm thinking we could do is introduce a
> new UpperRelationKind called UPPERREL_PARTIAL_GROUP_AGG, coming just
> before UPPERREL_GROUP_AGG. Without partition-wise aggregate,
> partially_grouped_rel's pathlist would always be NIL, and its partial
> pathlist would be constructed using the logic in
> add_partial_paths_to_grouping_rel, which would need renaming. Then,
> add_paths_to_grouping_rel would use paths from input_rel when doing
> non-parallel aggregation and paths from partially_grouped_rel when
> doing parallel aggregation. This would eliminate the ugly
> grouped_rel->partial_pathlist = NIL assignment at the bottom of
> create_grouping_paths(), because the grouped_rel's partial_pathlist
> would never have been (bogusly) populated in the first place, and
> hence would not need to be reset. All of these changes could be made
> via a preparatory patch.
>

I wrote a patch for this (on current HEAD) and attached separately here.
Please have a look.

I still not yet fully understand how we are going to pass those to the
add_paths_to_append_rel(). I need to look it more deeply though.

> Then the main patch needs to worry about four cases:
>
> 1. Parallel partition-wise aggregate, grouping key doesn't contain
> partition key. This should just be a matter of adding additional
> Append paths to partially_grouped_rel->partial_pathlist. The existing
> code already knows how to stick a Gather and FinalizeAggregate step on
> top of that, and I don't see why that logic would need any
> modification or addition. An Append of child partial-grouping paths
> should be producing the same output as a partial grouping of an
> Append, except that the former case might produce more separate groups
> that need to be merged; but that should be OK: we can just throw all
> the paths into the same path list and let the cheapest one win.
>

For any partial aggregation we need to add finalization step after we are
done with the APPEND i.e. post add_paths_to_append_rel(). Given that we
need to replicate the logic of sticking Gather and FinalizeAggregate step
at later stage. This is what exactly done in create_partition_agg_paths().
Am I missing something here?

> 2. Parallel partition-wise aggregate, grouping key contains partition
> key. For the most part, this is no different from case #1. We won't
> have groups spanning different partitions in this case, but we might
> have groups spanning different workers, so we still need a
> FinalizeAggregate step. As an exception, Gather -> Parallel Append ->
> [non-partial Aggregate path] would give us a way of doing aggregation
> in parallel without a separate Finalize step. I'm not sure if we want
> to consider that to be in scope for this patch. If we do, then we'd
> add the Parallel Append path to grouped_rel->partial_pathlist. Then,
> we could stick Gather (Merge) on top if it to produce a path for
> grouped_rel->pathlist using generate_gather_paths(); alternatively, it
> can be used by upper planning steps -- something we currently can't
> ever make work with parallel aggregation.
>
> 3. Non-parallel partition-wise aggregate, grouping key contains
> partition key. Build Append paths from the children of grouped_rel
> and add them to grouped_rel->pathlist.
>

Yes.

>
> 3. Non-parallel partition-wise aggregate, grouping key doesn't contain
> partition key. Build Append paths from the children of
> partially_grouped_rel and add them to partially_grouped_rel->pathlist.
> Also add code to generate paths for grouped_rel->pathlist by sticking
> a FinalizeAggregate step on top of each path from
> partially_grouped_rel->pathlist.
>

Yes, this is done in create_partition_agg_paths().
create_partition_agg_paths() basically adds gather path, if required and
then finalizes it again if required. These steps are similar to that of
add_paths_to_grouping_rel() counterpart which does gather + finalization.

> Overall, what I'm trying to argue for here is making this feature look
> less like its own separate thing and more like part of the general
> mechanism we've already got: partial paths would turn into regular
> paths via generate_gather_paths(), and partially aggregated paths
> would turn into fully aggregated paths by adding FinalizeAggregate.
> The existing special case that allows us to build a non-partial, fully
> aggregated path from a partial, partially-aggregated path would be
> preserved.
>
> I think this would probably eliminate some other problems I see in the
> existing design as well. For example, create_partition_agg_paths
> doesn't consider using Gather Merge, but that might be a win.

Append path is always non-sorted and has no pathkeys. Thus Gather Merge
over an Append path seems infeasible, isn't it?

> With
> the design above, I think you never need to call create_gather_path()
> anywhere. In case #1, the existing code takes care of it. In the
> special case mentioned under #2, if we chose to support that,
> generate_gather_paths() would take care of it. Both of those places
> already know about Gather Merge.
>

I don't understand how exactly, will have more careful look over this.

> On another note, I found preferFullAgg to be wicked confusing. To
> "prefer" something is to like it better, but be willing to accept
> other options if the preference can't be accommodated. Here, it seems
> like preferFullAgg = false prevents consideration of full aggregation.
> So it's really more like allowFullAgg, or, maybe better,
> try_full_aggregation. Also, try_partition_wise_grouping has a
> variable isPartialAgg which is always ends up getting set to
> !preferFullAgg. Having two Boolean variables which are always set to
> the opposite of each other isn't good. To add to the confusion, the
> code following the place where isPartialAgg is set sometimes refers to
> isPartialAgg and sometimes refers to preferFullAgg.
>

I will have a look over this and commenting part.

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Thanks

--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Attachment Content-Type Size
partially_grouped_rel.patch text/x-patch 7.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-02-01 14:01:17 Re: JIT compiling with LLVM v9.0
Previous Message Andres Freund 2018-02-01 13:54:08 Re: ALTER TABLE ADD COLUMN fast default