Re: [HACKERS] Partition-wise aggregation/grouping

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Partition-wise aggregation/grouping
Date: 2018-03-01 21:52:59
Message-ID: CA+TgmobndViWApz3D4F9Vtkusm=rsc99XYtCYnfHOrzoVvrSCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 1, 2018 at 5:34 AM, Jeevan Chalke
<jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> Attached new patchset after rebasing my changes over these changes and on
> latest HEAD.

+ * We have already created a Gather or Gather Merge path atop cheapest
+ * partial path. Thus the partial path referenced by the
Gather node needs
+ * to be preserved as adding new partial paths in same rel may
delete this
+ * referenced path. To do this we need to clear the
partial_pathlist from
+ * the partially_grouped_rel as we may add partial paths again
while doing
+ * partitionwise aggregation. Keeping older partial path intact seems
+ * reasonable too as it might possible that the final path
chosen which is
+ * using it wins, but the underneath partial path is not the
cheapest one.

This isn't a good design. You shouldn't create a Gather or Gather
Merge node until all partial paths have been added. I mean, the point
is to put a Gather node on top of the cheapest path, not the path that
is currently the cheapest but might not actually be the cheapest once
we've added them all.

+add_gather_or_gather_merge(PlannerInfo *root,

Please stop picking generic function names for functions that have
very specific purposes. I don't really think that you need this to be
a separate function at all, but it it is certainly NOT a
general-purpose function for adding a Gather or Gather Merge node.

+ /*
+ * Collect statistics about aggregates for estimating costs of
+ * performing aggregation in parallel or in partial, if not already
+ * done. We use same cost for all the children as they will be same
+ * anyways.
+ */

If it only needs to be done once, do we really have to have it inside
the loop? I see that you're using the varno-translated
partial_target->exprs and target->exprs, but if the specific varnos
don't matter, why not just use the untranslated version of the targets
before entering the loop? And if the specific varnos do matter, then
presumably you need to do it every time.

This is not a full review, but I'm out of time for right now.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2018-03-01 21:57:37 Re: jsonpath
Previous Message David Steele 2018-03-01 21:51:39 Re: row filtering for logical replication