Re: [HACKERS] Partition-wise aggregation/grouping

From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Partition-wise aggregation/grouping
Date: 2018-02-26 11:38:30
Message-ID: CAM2+6=U-B5AOGmU0s8sRVYFqoHq+fXF2FsmUVexCMF0VA3D5gw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Robert,

On Fri, Feb 23, 2018 at 2:53 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Thu, Feb 8, 2018 at 8:05 AM, Jeevan Chalke
> <jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> > In this attached version, I have rebased my changes over new design of
> > partially_grouped_rel. The preparatory changes of adding
> > partially_grouped_rel are in 0001.
>
> I spent today hacking in 0001; results attached. The big change from
> your version is that this now uses generate_gather_paths() to add
> Gather/Gather Merge nodes (except in the case where we sort by group
> pathkeys and then Gather Merge) rather than keeping all of the bespoke
> code. That turned up to be a bit less elegant than I would have liked
> -- I had to an override_rows argument to generate_gather_paths to make
> it work. But overall I think this is still a big improvement, since
> it lets us share code instead of duplicating it. Also, it potentially
> lets us add partially-aggregated but non-parallel paths into
> partially_grouped_rel->pathlist and that should Just Work; they will
> get the Finalize Aggregate step but not the Gather. With your
> arrangement that wouldn't work.
>
> Please review/test.
>

I have reviewed and tested the patch and here are my couple of points:

/*
- * If the input rel belongs to a single FDW, so does the grouped rel.
+ * If the input rel belongs to a single FDW, so does the grouped rel.
Same
+ * for the partially_grouped_rel.
*/
grouped_rel->serverid = input_rel->serverid;
grouped_rel->userid = input_rel->userid;
grouped_rel->useridiscurrent = input_rel->useridiscurrent;
grouped_rel->fdwroutine = input_rel->fdwroutine;
+ partially_grouped_rel->serverid = input_rel->serverid;
+ partially_grouped_rel->userid = input_rel->userid;
+ partially_grouped_rel->useridiscurrent = input_rel->useridiscurrent;
+ partially_grouped_rel->fdwroutine = input_rel->fdwroutine;

In my earlier mail where I have posted a patch for this partially grouped
rel changes, I forgot to put my question on this.
I was unclear about above changes and thus passed grouped_rel whenever we
wanted to work on partially_grouped_rel to fetch relevant details.

One idea I thought about is to memcpy the struct once we have set all
required fields for grouped_rel so that we don't have to do similar stuff
for partially_grouped_rel.

---

+ * Insert a Sort node, if required. But there's no point in
+ * sorting anything but the cheapest path.
*/
- if (root->group_pathkeys)
+ if (!pathkeys_contained_in(root->group_pathkeys,
path->pathkeys))
+ {
+ if (path != linitial(partially_grouped_rel->pathlist))
+ continue;

Paths in pathlist are added by add_path(). Though we have paths is pathlist
is sorted with the cheapest total path, we generally use
RelOptInfo->cheapest_total_path instead of using first entry, unlike
partial paths. But here you use the first entry like partial paths case.
Will it better to use cheapest total path from partially_grouped_rel? This
will require calling set_cheapest on partially_grouped_rel before we call
this function.

Attached top-up patch doing this along with few indentation fixes.

Rest of the changes look good to me.

Once this gets in, I will re-base my other patches accordingly.

And, thanks for committing 0006.

Thanks

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Attachment Content-Type Size
use_cheapest_total_path.patch text/x-patch 2.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2018-02-26 11:50:05 Optimizing nested ConvertRowtypeExpr execution
Previous Message Thomas Munro 2018-02-26 10:15:44 Re: Unexpected behavior with transition tables in update statement trigger