Re: [HACKERS] Partition-wise aggregation/grouping

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Partition-wise aggregation/grouping
Date: 2018-01-15 22:11:42
Message-ID: CA+Tgmoa+H1X8Te8JB9A+F=FkoY5xQNSmodkhW_qdm-C1SA3JXg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 11, 2018 at 6:00 AM, Jeevan Chalke
<jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> Attached new set of patches adding this. Only patch 0007 (main patch) and
> 0008 (testcase patch) has changed.
>
> Please have a look and let me know if I missed any.

I spent a little time studying 0001 and 0002 today, as well as their
relation with 0007. I find the way that the refactoring has been done
slightly odd. With 0001 and 0002 applied, we end up with three
functions for creating aggregate paths: create_partial_agg_path, which
handles the partial-path case for both sort and hash;
create_sort_agg_path, which handles the sort case for non-partial
paths only; and create_hash_agg_path, which handles the hash case for
non-partial paths only. This leads to the following code in 0007:

+ /* Full grouping paths */
+
+ if (try_parallel_aggregation)
+ {
+ Assert(extra->agg_partial_costs &&
extra->agg_final_costs);
+ create_partial_agg_path(root, input_rel,
grouped_rel, target,
+
partial_target, extra->agg_partial_costs,
+
extra->agg_final_costs, gd, can_sort,
+
can_hash, (List *) extra->havingQual);
+ }
+
+ if (can_sort)
+ create_sort_agg_path(root, input_rel,
grouped_rel, target,
+
partial_target, agg_costs,
+
extra->agg_final_costs, gd, can_hash,
+
dNumGroups, (List *) extra->havingQual);
+
+ if (can_hash)
+ create_hash_agg_path(root, input_rel,
grouped_rel, target,
+
partial_target, agg_costs,
+
extra->agg_final_costs, gd, dNumGroups,
+ (List
*) extra->havingQual);

That looks strange -- you would expect to see either "sort" and "hash"
cases here, or maybe "partial" and "non-partial", or maybe all four
combinations, but seeing three things here looks surprising. I think
the solution is just to create a single function that does both the
work of create_sort_agg_path and the work of create_hash_agg_path
instead of having two separate functions.

A related thing that is also surprising is that 0007 manages to reuse
create_partial_agg_path for both the isPartialAgg and non-isPartialAgg
cases -- in fact, the calls appear to be identical, and could be
hoisted out of the "if" statement -- but create_sort_agg_path and
create_hash_agg_path do not get reused. I think you should see
whether you can define the new combo function that can be used for
both cases. The logic looks very similar, and I'm wondering why it
isn't more similar than it is; for instance, create_sort_agg_path
loops over the input rel's pathlist, but the code for
isPartialAgg/can_sort seems to consider only the cheapest path. If
this is correct, it needs a comment explaining it, but I don't see why
it should be correct.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chapman Flack 2018-01-15 22:19:27 Re: proposal: alternative psql commands quit and exit
Previous Message Tom Lane 2018-01-15 21:38:04 Re: [HACKERS] postgres_fdw bug in 9.6