Re: Aggregate Push Down - Performing aggregation on foreign server

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Aggregate Push Down - Performing aggregation on foreign server
Date: 2016-09-08 17:11:01
Message-ID: CAFjFpRdYYRB2AWOX3cR=TtE4tYqzNAfUKVie5LcHpY6AwY9=Zg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> While checking for shippability, we build the target list which is passed
> to
> the foreign server as fdw_scan_tlist. The target list contains
> a. All the GROUP BY expressions
> b. Shippable entries from the target list of upper relation
> c. Var and Aggref nodes from non-shippable entries from the target list of
> upper relation
>

The code in the patch doesn't seem to add Var nodes explicitly. It assumes
that
the Var nodes will be part of GROUP BY clause. The code is correct, I think.

> d. Var and Aggref nodes from non-shippable HAVING conditions.
>

This needs to be changed as per the comments below.

>
> Costing:
>
> If use_foreign_estimate is true for input relation, like JOIN case, we use
> EXPLAIN output to get the cost of query with aggregation/grouping on the
> foreign server. If not we calculate the costs locally. Similar to core, we
> use
> get_agg_clause_costs() to get costs for aggregation and then using logic
> similar to cost_agg() we calculate startup and total cost. Since we have no
> idea which aggregation strategy will be used at foreign side, we add all
> startup cost (startup cost of input relation, aggregates etc.) into startup
> cost for the grouping path and similarly for total cost.
>

This looks OK to me.

>
> Deparsing the query:
>
> Target list created while checking for shippability is deparsed using
> deparseExplicitTargetList(). sortgroupref are adjusted according to this
> target list. Most of the logic to deparse an Aggref is inspired from
> get_agg_expr(). For an upper relation, FROM and WHERE clauses come from the
> underlying scan relation and thus for simplicity, FROM clause deparsing
> logic
> is moved from deparseSelectSql() to a new function deparseFromClause(). The
> same function adds WHERE clause to the remote SQL.
>

Ok.

>
>
> Area of future work:
>
> 1. Adding path with path-keys to push ORDER BY clause along with
> aggregation/
> grouping. Should be supported as a separate patch.
>
> 2. Grouping Sets/Rollup/Cube is not supported in current version. I have
> left
> this aside to keep patch smaller. If required I can add that support in the
> next version of the patch.
>

I am fine with these limitations for first cut of this feature.

I think we should try to measure performance gain because of aggregate
pushdown. The EXPLAIN
doesn't show actual improvement in the execution times.

Here are the comments on the patch.

Patch compiles without errors/warnings - Yes
make check passes - Yes.
make check in contrib/postgres_fdw passes - Yes

Attached patch (based on your patch) has some typos corrected, some comments
rephrased. It also has some code changes, as explained in various comments
below. Please see if those look good.

1. Usually for any deparseSomething function, Something is the type of node
produced by the parser when the string output by that function is parsed.
deparseColumnRef, for example, produces a string, which when parsed
produces a
columnRef node. There is are nodes of type FromClause and AggOrderBy. I
guess,
you meant FromExpr instead of FromClause. deparseAggOrderBy() may be
renamed as
appendOrderByFromList() or something similar. It may be utilized for window
functions if required.

2. Can we infer the value of foragg flag from the RelOptInfo passed to
is_foreign_expr()? For any upper relation, it is ok to have aggregate in
there. For any other relation aggregate is not expected.

3. In function foreign_grouping_ok(), should we use classifyConditions()?
The
function is written and used for base relations. There's nothing in that
function, which prohibits it being used for other relations. I feel that
foreign_join_ok() should have used the same function to classify the other
clauses.

4. May be the individual criteria in the comment block
+ /*
+ * Aggregate is safe to pushdown if
+ * 1. It is a built-in aggregate
+ * 2. All its arguments are safe to push-down
+ * 3. The functions involved are immutable.
+ * 4. Other expressions involved like aggorder,
aggdistinct are
+ * safe to be pushed down.
+ */
should be associated with the code blocks which implement those. As the
criteria change it's difficult to maintain the numbered list in sync with
the
code.

5. The comment
+ /* Aggregates other than simple one are non-pushable. */
should read /* Only non-split aggregates are pushable. */ as AGGSPLIT_SIMPLE
means a complete, non-split aggregation step.

6. An aggregate function has transition, combination and finalization
function
associated with it. Should we check whether all of the functions are
shippable?
But probably it suffices to check whether aggregate function as whole is
shippable or not using is_shippable() since it's the whole aggregate we are
interested in and not the intermediate results. Probably, we should add a
comment explaining why it's sufficient to check the aggregate function as a
whole. I wondered whether we should check shippability of aggregate return
type, but we don't do that for functions as well. So it's fine.

7. It looks like aggdirectargs, aggdistinct, aggorder expressions are all
present in args list. If all the expressions in args are shippable, it means
those lists are also shippable and hence corresponding aggregate subclauses
are
shippable. Are we unnecessarily checking shippability of those other lists?

8. The prologue of build_tlist_to_deparse() mentions that the output
targetlist
contains columns, which is not true with your patch. The targetlist
returned by
this function can have expressions for upper relations. Please update the
prologue to reflect this change.

9. In build_tlist_to_deparse(), we are including aggregates from unshippable
conditions without checking whether they are shippable or not. This can
cause
an unshippable expression to be pushed to the foreign server as seen below

explain verbose select avg(c1) from ft1 having avg(c1 * random()) > 100;
QUERY
PLAN
--------------------------------------------------------------------------------------
Foreign Scan (cost=112.50..133.53 rows=1 width=32)
Output: (avg(c1))
Filter: ((avg(((ft1.c1)::double precision * random()))) > '100'::double
precision)
Relations: Aggregate on (public.ft1)
Remote SQL: SELECT avg(r1."C 1"), avg((r1."C 1" * random())) FROM "S
1"."T 1" r1
(5 rows)

We should check shippability of aggregates in unshippable conditions in
foreign_grouping_ok(). If any of those are not shippable, aggregation can
not
be pushed down. Otherwise, include those in the grouped_tlist.

10. Comment /* Construct FROM clause */ should read "Construct FROM and
WHERE
clauses." to be in sync with the next function call. deparseFromClause()
should
be renamed as deparseFromExpr() inline with the naming convention of
deparse<something> functions. From the function signature of
deparseFromClause(), it doesn't become clear as to what contributes to the
FROM
clause. Should we add a comment in the prologue of that function explaining
the
same. Also it strikes odd that for an upper relation, we pass remote_conds
to
this function, but it doesn't deparse those but deparses the remote
conditions
from the scan relation and later deparses the same remote_conds as HAVING
clause. Although this is correct, the code looks odd. May be the codeblock
in
deparseFuncClause()
1008 /*
1009 * For aggregates the FROM clause will be build from underneath
scan rel.
1010 * WHERE clause conditions too taken from there.
1011 */
1012 if (foreignrel->reloptkind == RELOPT_UPPER_REL)
1013 {
1014 PgFdwRelationInfo *ofpinfo;
1015
1016 ofpinfo = (PgFdwRelationInfo *) fpinfo->outerrel->fdw_private;
1017 scan_rel = fpinfo->outerrel;
1018 context->foreignrel = scan_rel;
1019 remote_conds = ofpinfo->remote_conds;
1020 }
should be moved into deparseSelectStmtForRel() to make the things clear.

11. Whether to qualify a column name by an alias should be based on whether
the
underlying scan relation is a join or base relation scan. That can be
achieved
by setting scanrel in the deparse_context. Attached patch has this
implemented.
Instead of scanrel, we may also have use_alias value as part of the context,
which is used everywhere.

12. The code to print the function name in deparseAggref() seems similar to
that in deparseFuncExpr(). Should we extract it out into a separate function
and call that function in deparseAggref() and deparseFuncExpr() both?

13. While deparsing the direct aggregates, the code is not adding ","
between
two arguments, resulting in error like below.
select rank(c1, c2) within group (order by c1, c2) from ft1 group by c1, c2;
ERROR: syntax error at or near "c2"
CONTEXT: Remote SQL command: SELECT rank("C 1"c2) WITHIN GROUP (ORDER BY "C
1", c2), "C 1", c2 FROM "S 1"."T 1" GROUP BY "C 1", c2
May be we should add support to deparse List expressions by separating list
items by "," similar to get_rule_expr() and use that here.

14. In deparseAggOrderBy() we do not output anything for default cases like
ASC
or NULLS FIRST (for DESC). But like appendOrderByClause() it's better to be
explicit and output even the default specifications. That way, we do not
leave
anything to the interpretation of the foreign server.

15. deparseAggOrderBy supports deparsing USING subclause for ORDER BY for an
aggregate, but there is no corresponding shippability check for ordering
operator. Also, we should try to produce a testcase for the same.

16. The code to deparse a sort/group reference in deparseAggOrderBy() and
appendGroupByClause() looks similar. Should we extract it out into a
function
by itself and call that function in those two places similar to
get_rule_sortgroupclause()?

17. In postgresGetForeignPlan() the code to extract conditions and
targetlist
is duplicated for join and upper relation. Attached patch removes this
duplication. Since DML, FOR SHARE/UPDATE is not allowed with aggregation and
grouping, we should get an outer plan in that code.

18. estimate_path_cost_size() has grown quite long (~400 lines). Half of
that
function deals with estimating costs locally. Should we split this function
into smaller functions, one for estimating size and costs of each kind of
relations locally?

19. Condition below
+ if (sgref && query->groupClause && query->groupingSets == NIL &&
+ get_sortgroupref_clause_noerr(sgref, query->groupClause) !=
NULL)
can be rewritten as
if (sgref && get_sortgroupref_clause_noerr(sgref, query->groupClause))
since we won't come here with non-NULL query->groupingSets and
get_sortgroupref_clause_noerr() will return NULL, if there aren't any
groupClause.

20. Please use word "foreign" instead of "remote" in comments.

21. This code looks dubious
+ /* If not GROUP BY ref, reset it as we are not pushing those */
+ if (sgref)
+ grouping_target->sortgrouprefs[i] = 0;
grouping_target comes from RelOptInfo, which is being modified here. We
shouldn't be modifying anything in the RelOptInfo while checking whether the
aggregate/grouping is shippable. You may want to copy the pathtaget and
modify
the copy.

22. Following comment needs more elaboration.
+ /*
+ * Add aggregates, if any, into tlist. Plain Var nodes
pulled
+ * are already part of GROUP BY and thus no need to add
them
+ * explicitly.
+ */
Plain Var nodes will either be same as some GROUP BY expression or should be
part of some GROUP BY expression. In the later case, the query can not refer
those without the surrounding expression. which will be part of the
targetlist
list. Hence we don't need to add plain Var nodes in the targetlist. In fact
adding those in SELECT clause will cause error on the foreign server if they
are not part of GROUP BY clause.

23. Probably you don't need to save this in fpinfo. You may want to
calculate
it whenever it's needed.
+ fpinfo->grouped_exprs = get_sortgrouplist_exprs(query->groupClause,
tlist);

24. Can this ever happen for postgres_fdw? The core sets fdwroutine and
calls
this function when the underlying scan relation has fdwroutine set, which
implies that it called corresponding GetForeignPath hooks, which should have
set fdw_private. Nonetheless, I think, still the condition is useful to
avoid
crashing the server in case the fdw_private is not set. But we need better
comments.
+ /* If input rel is not aware of fdw, simply return */
+ if (!input_rel->fdw_private)
+ return;

25. Although it's desirable to get a switch case in
postgresGetForeignUpperPaths() eventually, for this patch I guess we should
have an if condition there.

26. Attached patch has restructured code in appendGroupByClause() to reduce
indentation and make the code readable. Let me know if this looks good to
you.

27. Prologue of create_foreign_grouping_paths() does not have formatting
similar to the surrounding functions. Please fix it. Since the function
"create"s and "add"s paths, it might be better to rename it as
add_foreign_grouping_paths().

28. Isn't the following true for any upper relation and shouldn't it be
part of
postgresGetForeignUpperPaths(), rather than create_foreign_grouping_paths()?
+ /*
+ * If input rel is not safe to pushdown, we cannot do grouping and/or
+ * aggregation on it.
+ */
+ if (!ifpinfo || !ifpinfo->pushdown_safe)
+ return;

29. We require RelOptInfo::relids since create_foreignscan_plan() copies
them
into ForeignPlan::fs_relids and executor uses them. So, I guess, we have to
set
those in the core somewhere. May be while calling fetch_upper_rel() in
grouping_planner(), we should call it with relids of the underlying scan
relation. I don't see any function calling fetch_upper_rel() with non-NULL
relids. In fact, if we are to create an upper relation with relids in
future,
this code is going to wipe that out, which will be undesirable. Also,
generally, when copying relids, it's better to use bms_copy() to make a copy
of Bitmapset, instead of just assigning the pointer.

30. By the time postgresGetForeignUpperPaths() gets called, the core has
already added its own paths, so it doesn't make much sense to set rows and
width grouped_rel in create_foreign_grouping_paths().

31. fpinfo->server and user fields are being set twice, once in
create_foreign_grouping_paths() then in foreign_grouping_ok(). Do we need
that?

32.
+ /*
+ * Do we want to create paths with pathkeys corresponding for
+ * root->query_pathkeys.
+ */
Yes, it would be desirable to do that. If we are not going to do that in
this
patch, may be remove that line or add a TODO kind of comment.

33. The patch marks build_path_tlist() as non-static but does not use it
anywhere outside creatplan.c. Is this change intentional?

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
pg_agg_pushdown_v1_extras.patch text/x-patch 57.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2016-09-08 17:13:21 Re: Parallel tuplesort (for parallel B-Tree index creation)
Previous Message Tom Lane 2016-09-08 16:52:11 Re: Default make target in test modules