Re: [HACKERS] Partition-wise aggregation/grouping

From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Partition-wise aggregation/grouping
Date: 2018-01-18 13:55:17
Message-ID: CAM2+6=UF7knvAUed59G-4HWJC+WewooRe23x5Egp1Sm7Poi4WQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 17, 2018 at 1:18 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Tue, Jan 16, 2018 at 3:56 AM, Jeevan Chalke
> <jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> > I will make your suggested changes that is merge create_sort_agg_path()
> and
> > create_hash_agg_path(). Will name that function as
> > create_sort_and_hash_agg_paths().
>
> I suggest add_paths_to_grouping_rel() and
> add_partial_paths_to_grouping_rel(), similar to what commit
> c44c47a773bd9073012935a29b0264d95920412c did with
> add_paths_to_append_rel().
>
> > Oops. My mistake. Missed. We should loop over the input rel's pathlist.
> >
> > Yep. With above change, the logic is very similar except
> > (1) isPartialAgg/can_sort case creates the partial paths and
> > (2) finalization step is not needed at this stage.
>
> I'm not sure what you mean by #1.
>

I mean, in case of isPartialAgg=true, we need to create a partial
aggregation path which has aggsplit=AGGSPLIT_INITIAL_SERIAL and should not
perform finalization at this stage. And thus add_paths_to_grouping_rel()
needs a flag to diffrentiate it. By adding that code chunk allows us to
reuse same function in both the cases i.e full and partial aggregation.

Attached patch with other review points fixed.

> > I think it can be done by passing a flag to create_sort_agg_path() (or
> new
> > combo function) and making appropriate adjustments. Do you think
> addition of
> > this new flag should go in re-factoring patch or main PWA patch?
> > I think re-factoring patch.
>
> I think the refactoring patch should move the existing code into a new
> function without any changes, and then the main patch should add an
> additional argument to that function that allows for either behavior.
>
> By the way, I'm also a bit concerned about this:
>
> + /*
> + * For full aggregation, we are done with the partial
> paths. Just
> + * clear it out so that we don't try to create a
> parallel plan over it.
> + */
> + grouped_rel->partial_pathlist = NIL;
>
> I think that's being done for the same reason as mentioned at the
> bottom of the current code for create_grouping_paths(). They are only
> partially aggregated and wouldn't produce correct final results if
> some other planning step -- create_ordered_paths, or the code that
> sets up final_rel -- used them as if they had been fully agggregated.
> I'm worried that there might be an analogous danger for partition-wise
> aggregation -- that is, that the paths being inserted into the partial
> pathlists of the aggregate child rels might get reused by some later
> planning step which doesn't realize that the output they produce
> doesn't quite match up with the rel to which they are attached. You
> may have already taken care of that problem somehow, but we should
> make sure that it's fully correct and clearly commented. I don't
> immediately see why the isPartialAgg case should be any different from
> the !isPartialAgg case.
>

Actually I needed this because in case of full aggregation we already build
all final aggregation paths before performing Append operation. However,
when we do append in add_paths_to_append_rel(), it thinks that
partial_pathlist present in grouped_rel is finalized one exactly like you
mentioned above and thus we need to clear it out.

But yes, for safer side, I think once we done with partition-wise
aggregation step, we need to again go through the partitioning chain and
need to clear out all child grouped rel's partial_pathlist for the reason
mentioned at the bottom of the current code for create_grouping_paths().

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Attachment Content-Type Size
partition-wise-agg-v11.tar.gz application/x-gzip 37.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Anastasia Lubennikova 2018-01-18 13:57:06 Re: [HACKERS] WIP: Covering + unique indexes.
Previous Message Michael Paquier 2018-01-18 13:35:47 Re: [HACKERS] REL9_6_STABLE - a minor bug in src/common/exec.c