Re: [HACKERS] Partition-wise aggregation/grouping

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Partition-wise aggregation/grouping
Date: 2018-03-20 20:34:39
Message-ID: CA+Tgmoak5m6UKdKN9R-uoiGuCS6K362W1+of0ZiOS8-i22nxnw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 20, 2018 at 10:46 AM, Jeevan Chalke
<jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> I have added all these three patches in the attached patch-set and rebased
> my changes over it.
>
> However, I have not yet made this patch-set dependednt on UPPERREL_TLIST
> changes you have proposed on another mail-thread and thus it has 0001 patch
> refactoring the scanjoin issue.
> 0002, 0003 and 0004 are your patches added in this patchset.
> 0005 and 0006 are further refactoring patches. 0006 adds a
> GroupPathExtraData which stores mostly child variant data and costs.
> 0007 is main partitionwise aggregation patch which is then rebased
> accordingly.
> 0008 contains testcase and 0009 contains FDW changes.

Committed my refactoring patches (your 0002-0004).

Regarding apply_scanjoin_target_to_paths in 0001 and 0007, it seems
like what happens is: we first build an Append path for the topmost
scan/join rel. That uses paths from the individual relations that
don't necessarily produce the final scan/join target. Then we mutate
those relations in place during partition-wise aggregate so that they
now do produce the final scan/join target and generate some more paths
using the results. So there's an ordering dependency, and the same
pathlist represents different things at different times. That is, I
suppose, not technically any worse than what we're doing for the
scan/join rel's pathlist in general, but here there's the additional
complexity that the paths get used both before and after being
mutated. The UPPERREL_TLIST proposal would clean this up, although I
realize that has unresolved issues.

In create_partial_grouping_paths, the loop that does "for (i = 0; i <
2; i++)" is not exactly what I had in mind when I said that we should
use two loops. I did not mean a loop with two iterations. I meant
adding a loop like foreach(lc, input_rel->pathlist) in each place
where we currently have a loop like
foreach(input_rel->partial_pathlist). See 0001, attached.

Don't write if (a) Assert(b) but rather Assert(!a || b). See 0002, attached.

In the patch as proposed, create_partial_grouping_paths() can get
called even if GROUPING_CAN_PARTIAL_AGG is not set. I think that's
wrong. If can_partial_agg() isn't accurately determining whether
partial aggregation is possible, and as Ashutosh and I have been
discussing, there's room for improvement in that area, then that's a
topic for some other set of patches. Also, the test in
create_ordinary_grouping_paths for whether or not to call
create_partial_grouping_paths() is super-complicated and uncommented.
I think a simpler approach is to allow create_partial_grouping_paths()
the option of returning NULL. See 0003, attached.

make_grouping_rel() claims that "For now, all aggregated paths are
added to the (GROUP_AGG, NULL) upperrel", but this is false: we no
longer have only one grouped upper rel.

I'm having a heck of a time understanding what is_partial_aggregation
and perform_partial_partitionwise_aggregation are supposed to be
doing. It seems like is_partial_aggregation means that we should ONLY
do partial aggregation, which is not exactly what the name implies.
It also seems like perform_partial_partitionwise_aggregation and
is_partial_aggregation differ only in that they apply to the current
level and to the child level respectively; can't we merge these
somehow so that we don't need both of them?

I think that if the last test in can_partitionwise_grouping were moved
before the previous test, it could be simplified to test only
(extra->flags & GROUPING_CAN_PARTIAL_AGG) == 0 and not
*perform_partial_partitionwise_aggregation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0003-Allow-create_partial_grouping_paths-to-return-NULL.patch application/octet-stream 2.5 KB
0002-Fix-Assert-style.patch application/octet-stream 1020 bytes
0001-Remove-loop-from-0-to-1.patch application/octet-stream 10.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2018-03-20 20:44:45 Add "docs" to top-level Makefile for non-GNU make?
Previous Message Tomas Vondra 2018-03-20 20:30:15 Re: [PROPOSAL] Shared Ispell dictionaries