Re: [HACKERS] Partition-wise aggregation/grouping

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Partition-wise aggregation/grouping
Date: 2018-01-16 19:48:38
Message-ID: CA+TgmobKjnCrwwSXNDsGLJW8q+RRaQ8956GcfyjCMXiDCiBKqg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 16, 2018 at 3:56 AM, Jeevan Chalke
<jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> I will make your suggested changes that is merge create_sort_agg_path() and
> create_hash_agg_path(). Will name that function as
> create_sort_and_hash_agg_paths().

I suggest add_paths_to_grouping_rel() and
add_partial_paths_to_grouping_rel(), similar to what commit
c44c47a773bd9073012935a29b0264d95920412c did with
add_paths_to_append_rel().

> Oops. My mistake. Missed. We should loop over the input rel's pathlist.
>
> Yep. With above change, the logic is very similar except
> (1) isPartialAgg/can_sort case creates the partial paths and
> (2) finalization step is not needed at this stage.

I'm not sure what you mean by #1.

> I think it can be done by passing a flag to create_sort_agg_path() (or new
> combo function) and making appropriate adjustments. Do you think addition of
> this new flag should go in re-factoring patch or main PWA patch?
> I think re-factoring patch.

I think the refactoring patch should move the existing code into a new
function without any changes, and then the main patch should add an
additional argument to that function that allows for either behavior.

By the way, I'm also a bit concerned about this:

+ /*
+ * For full aggregation, we are done with the partial
paths. Just
+ * clear it out so that we don't try to create a
parallel plan over it.
+ */
+ grouped_rel->partial_pathlist = NIL;

I think that's being done for the same reason as mentioned at the
bottom of the current code for create_grouping_paths(). They are only
partially aggregated and wouldn't produce correct final results if
some other planning step -- create_ordered_paths, or the code that
sets up final_rel -- used them as if they had been fully agggregated.
I'm worried that there might be an analogous danger for partition-wise
aggregation -- that is, that the paths being inserted into the partial
pathlists of the aggregate child rels might get reused by some later
planning step which doesn't realize that the output they produce
doesn't quite match up with the rel to which they are attached. You
may have already taken care of that problem somehow, but we should
make sure that it's fully correct and clearly commented. I don't
immediately see why the isPartialAgg case should be any different from
the !isPartialAgg case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2018-01-16 19:51:13 Re: let's not complain about harmless patch-apply failures
Previous Message David Fetter 2018-01-16 19:31:28 Re: let's not complain about harmless patch-apply failures