Re: [HACKERS] Partition-wise aggregation/grouping

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Partition-wise aggregation/grouping
Date: 2018-03-19 17:26:18
Message-ID: CA+TgmoZ+ZJTVad-=vEq393N99KTooxv9k7M+z73qnTAqkb49BQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Fri, Mar 16, 2018 at 1:50 PM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> Ok. That looks good.

Here's an updated version. In this version, based on a voice
discussion with Ashutosh and Jeevan, I adjusted 0001 to combine it
with an earlier idea of splitting Gather/Gather Merge path generation
out of the function that creates partially aggregated paths. The idea
here is that create_ordinary_gather_paths() could first call
create_partial_grouping_paths(), then add additional paths which might
be partial or non-partial by invoking the partition-wise aggregate
logic, then call gather_grouping_paths() and set_cheapest() to
finalize the partially grouped rel. Also, I added draft commit
messages.

With this patch set applied, the key bit of logic in
create_ordinary_grouping_paths() ends up looking like this:

if (grouped_rel->consider_parallel && input_rel->partial_pathlist != NIL
&& (flags & GROUPING_CAN_PARTIAL_AGG) != 0)
{
partially_grouped_rel =
create_partial_grouping_paths(root,
grouped_rel,
input_rel,
gd,
can_sort,
can_hash,
&agg_final_costs);
gather_grouping_paths(root, partially_grouped_rel);
set_cheapest(partially_grouped_rel);
}

I imagine that what the main partition-wise aggregate patch would do
is (1) change the conditions under which
create_partial_grouping_paths() gets called, (2) postpone
gather_grouping_paths() and set_cheapest() until after partition-wise
aggregate had been done, doing them only if partially_grouped_rel !=
NULL. Partition-wise aggregate will need to happen before
add_paths_to_grouping_rel(), though, so that the latter function can
try a FinalizeAggregate node on top of an Append added by
partition-wise aggregate.

This is a bit strange, because it will mean that partition-wise
aggregate will be attempted BEFORE adding ordinary aggregate paths to
grouped_rel but AFTER adding them to partially_grouped_rel. We could
fix that by splitting add_paths_to_grouping_rel() into two functions,
one of which performs full aggregation directly and the other of which
tries finishing partial aggregation. I'm unsure that's a good idea
though: it would mean that we have very similar logic in two different
functions that could get out of sync as a result of future code
changes, and it's not really fixing any problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0003-Don-t-pass-the-grouping-target-around-unnecessarily.patch application/octet-stream 10.7 KB
0002-Determine-grouping-strategies-in-create_grouping_pat.patch application/octet-stream 9.0 KB
0001-Defer-creation-of-partially-grouped-relation-until-i.patch application/octet-stream 18.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2018-03-19 17:45:49 Re: [HACKERS] Partition-wise aggregation/grouping
Previous Message Jeremy Finzel 2018-03-19 17:10:36 found xmin from before relfrozenxid on pg_catalog.pg_authid