Re: [HACKERS] Partition-wise aggregation/grouping

From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Partition-wise aggregation/grouping
Date: 2018-03-07 14:30:37
Message-ID: CAM2+6=WBHSQF1nZnBJbktFDL9sqnueAL_KyzRx_3FW2kV11bXQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 6, 2018 at 4:59 PM, Ashutosh Bapat <
ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:

> Hi Jeevan,
> I am back reviewing this. Here are some comments.
>
> @@ -1415,7 +1413,8 @@ add_paths_to_append_rel(PlannerInfo *root,
> RelOptInfo *rel,
> * the unparameterized Append path we are constructing for the
> parent.
> * If not, there's no workable unparameterized path.
> */
> - if (childrel->cheapest_total_path->param_info == NULL)
> + if (childrel->pathlist != NIL &&
> + childrel->cheapest_total_path->param_info == NULL)
> accumulate_append_subpath(childrel->cheapest_total_path,
> &subpaths, NULL);
> else
> @@ -1683,6 +1682,13 @@ add_paths_to_append_rel(PlannerInfo *root,
> RelOptInfo *rel,
> RelOptInfo *childrel = (RelOptInfo *) lfirst(lcr);
> Path *subpath;
>
> + if (childrel->pathlist == NIL)
> + {
> + /* failed to make a suitable path for this child */
> + subpaths_valid = false;
> + break;
> + }
> +
> When can childrel->pathlist be NIL?
>

Done. Sorry it was leftover from my earlier trial. Not needed now. Removed.

>
> diff --git a/src/backend/optimizer/plan/createplan.c
> b/src/backend/optimizer/plan/createplan.c
> index 9ae1bf3..f90626c 100644
> --- a/src/backend/optimizer/plan/createplan.c
> +++ b/src/backend/optimizer/plan/createplan.c
> @@ -1670,7 +1670,15 @@ create_sort_plan(PlannerInfo *root, SortPath
> *best_path, int flags)
> subplan = create_plan_recurse(root, best_path->subpath,
> flags | CP_SMALL_TLIST);
>
> - plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys,
> NULL);
> + /*
> + * In find_ec_member_for_tle(), child EC members are ignored if they
> don't
> + * belong to the given relids. Thus, if this sort path is based on a
> child
> + * relation, we must pass the relids of it. Otherwise, we will end-up
> into
> + * an error requiring pathkey item.
> + */
> + plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys,
> + IS_OTHER_REL(best_path->subpath->parent)
> ?
> + best_path->path.parent->relids : NULL);
>
> copy_generic_path_info(&plan->plan, (Path *) best_path);
>
> Please separate this small adjustment in a patch of its own, with some
> explanation of why we need it i.e. now this function can see SortPaths from
> child (other) relations.
>

I am not sure whether it is good to split this out of the main patch. Main
patch exposes this requirement and thus seems better to have these changes
in main patch itself.
However, I have no issues in extracting it into a separate small patch. Let
me know your views.

>
> + if (child_data)
> + {
> + /* Must be other rel as all child relations are marked OTHER_RELs
> */
> + Assert(IS_OTHER_REL(input_rel));
>
> I think we should check IS_OTHER_REL() and Assert(child_data). That way we
> know
> that the code in the if block is executed for OTHER relation.
>

Done.

>
> - if ((root->hasHavingQual || parse->groupingSets) &&
> + if (!child_data && (root->hasHavingQual || parse->groupingSets) &&
>
> Degenerate grouping will never see child relations, so instead of checking
> for
> child_data, Assert (!IS_OTHER_REL()) inside this block. Add a comment there
> explaining the assertion.
>

Done.

>
> + *
> + * If we are performing grouping for a child relation, fetch can_sort
> from
> + * the child_data to avoid re-calculating same.
> */
> - can_sort = (gd && gd->rollups != NIL)
> - || grouping_is_sortable(parse->groupClause);
> + can_sort = child_data ? child_data->can_sort : ((gd &&
> gd->rollups != NIL) ||
> +
> grouping_is_sortable(parse->groupClause));
>
> Instead of adding a conditional here, we can compute these values before
> create_grouping_paths() is called from grouping_planner() and then pass
> them
> down to try_partitionwise_grouping(). I have attached a patch which
> refactors
> this code. Please see if this refactoring is useful. In the attached
> patch, I
> have handled can_sort, can_hash and partial aggregation costs. More on the
> last
> component below.
>
>
> /*
> * Figure out whether a PartialAggregate/Finalize Aggregate execution
> @@ -3740,10 +3789,8 @@ create_grouping_paths(PlannerInfo *root,
> * partial paths for partially_grouped_rel; that way, later code can
> * easily consider both parallel and non-parallel approaches to
> grouping.
> */
> - if (try_parallel_aggregation)
> + if (!child_data && !(agg_costs->hasNonPartial ||
> agg_costs->hasNonSerial))
> {
> - PathTarget *partial_grouping_target;
> -
> [... clipped ...]
> + get_agg_clause_costs(root, havingQual,
> AGGSPLIT_FINAL_DESERIAL,
> - &agg_final_costs);
> + agg_final_costs);
> }
> + }
>
> With this change, we are computing partial aggregation costs even in
> the cases when
> those will not be used e.g. when there are no children and parallel paths
> can
> not be created. In the attached patch, I have refactored the code such that
> they are computed when they are needed the first time and re-used later.
>
> + if (child_data)
> + {
> + partial_grouping_target = child_data->partialRelTarget;
> + partially_grouped_rel->reltarget = partial_grouping_target;
> + agg_partial_costs = child_data->agg_partial_costs;
> + agg_final_costs = child_data->agg_final_costs;
> + }
>
> I think, with the refactoring, we can get rid of the last two lines here. I
> think we can get rid of this block entirely, but I have not reviewed the
> entire
> code to confirm that.
>

I have added your patch as one of the refactoring patch and rebased my
changes over it.
Yes, it removed this block and other few conditions too.

>
> static PathTarget *
> -make_partial_grouping_target(PlannerInfo *root, PathTarget
> *grouping_target)
> +make_partial_grouping_target(PlannerInfo *root,
> + PathTarget *grouping_target,
> + Node *havingQual)
> This looks like a refactoring change. Should go to one of the refactoring
> patches or in a patch of its own.
>

OK. Refactored into separate patch.

Will post a new patchset with these changes included.

>
> This isn't full review. I will continue reviewing this further.
>

Sure.

>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>

--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nikita Glukhov 2018-03-07 14:33:10 Re: jsonpath
Previous Message David Steele 2018-03-07 14:25:37 Re: [HACKERS] Subscription code improvements