Re: Aggregate Push Down - Performing aggregation on foreign server

From: Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Aggregate Push Down - Performing aggregation on foreign server
Date: 2016-09-12 06:50:45
Message-ID: CANEvxPokcFi7OfEpi3=J+mvxu+PPAG2zXASLEkv5DzDPhSTk6A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

While testing "Aggregate pushdown", i found the below error:
-- GROUP BY alias showing different behavior after adding patch.

-- Create table "t1", insert few records.
create table t1(c1 int);
insert into t1 values(10), (20);

-- Create foreign table:
create foreign table f_t1 (c1 int) server db1_server options (table_name
't1');

-- with local table:
postgres=# select 2 a, avg(c1) from t1 group by a;
a | avg
---+---------------------
2 | 15.0000000000000000
(1 row)

-- with foreign table:
postgres=# select 2 a, avg(c1) from f_t1 group by a;
ERROR: aggregate functions are not allowed in GROUP BY
CONTEXT: Remote SQL command: EXPLAIN SELECT 2, avg(c1) FROM public.t1
GROUP BY 2

On Thu, Sep 8, 2016 at 10:41 PM, Ashutosh Bapat <
ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:

>
>
>
>
>> While checking for shippability, we build the target list which is passed
>> to
>> the foreign server as fdw_scan_tlist. The target list contains
>> a. All the GROUP BY expressions
>> b. Shippable entries from the target list of upper relation
>> c. Var and Aggref nodes from non-shippable entries from the target list of
>> upper relation
>>
>
> The code in the patch doesn't seem to add Var nodes explicitly. It assumes
> that
> the Var nodes will be part of GROUP BY clause. The code is correct, I
> think.
>
>
>> d. Var and Aggref nodes from non-shippable HAVING conditions.
>>
>
> This needs to be changed as per the comments below.
>
>>
>> Costing:
>>
>> If use_foreign_estimate is true for input relation, like JOIN case, we use
>> EXPLAIN output to get the cost of query with aggregation/grouping on the
>> foreign server. If not we calculate the costs locally. Similar to core,
>> we use
>> get_agg_clause_costs() to get costs for aggregation and then using logic
>> similar to cost_agg() we calculate startup and total cost. Since we have
>> no
>> idea which aggregation strategy will be used at foreign side, we add all
>> startup cost (startup cost of input relation, aggregates etc.) into
>> startup
>> cost for the grouping path and similarly for total cost.
>>
>
> This looks OK to me.
>
>
>>
>> Deparsing the query:
>>
>> Target list created while checking for shippability is deparsed using
>> deparseExplicitTargetList(). sortgroupref are adjusted according to this
>> target list. Most of the logic to deparse an Aggref is inspired from
>> get_agg_expr(). For an upper relation, FROM and WHERE clauses come from
>> the
>> underlying scan relation and thus for simplicity, FROM clause deparsing
>> logic
>> is moved from deparseSelectSql() to a new function deparseFromClause().
>> The
>> same function adds WHERE clause to the remote SQL.
>>
>
> Ok.
>
>
>>
>>
>> Area of future work:
>>
>> 1. Adding path with path-keys to push ORDER BY clause along with
>> aggregation/
>> grouping. Should be supported as a separate patch.
>>
>> 2. Grouping Sets/Rollup/Cube is not supported in current version. I have
>> left
>> this aside to keep patch smaller. If required I can add that support in
>> the
>> next version of the patch.
>>
>
> I am fine with these limitations for first cut of this feature.
>
> I think we should try to measure performance gain because of aggregate
> pushdown. The EXPLAIN
> doesn't show actual improvement in the execution times.
>
> Here are the comments on the patch.
>
> Patch compiles without errors/warnings - Yes
> make check passes - Yes.
> make check in contrib/postgres_fdw passes - Yes
>
> Attached patch (based on your patch) has some typos corrected, some
> comments
> rephrased. It also has some code changes, as explained in various comments
> below. Please see if those look good.
>
> 1. Usually for any deparseSomething function, Something is the type of node
> produced by the parser when the string output by that function is parsed.
> deparseColumnRef, for example, produces a string, which when parsed
> produces a
> columnRef node. There is are nodes of type FromClause and AggOrderBy. I
> guess,
> you meant FromExpr instead of FromClause. deparseAggOrderBy() may be
> renamed as
> appendOrderByFromList() or something similar. It may be utilized for window
> functions if required.
>
> 2. Can we infer the value of foragg flag from the RelOptInfo passed to
> is_foreign_expr()? For any upper relation, it is ok to have aggregate in
> there. For any other relation aggregate is not expected.
>
> 3. In function foreign_grouping_ok(), should we use classifyConditions()?
> The
> function is written and used for base relations. There's nothing in that
> function, which prohibits it being used for other relations. I feel that
> foreign_join_ok() should have used the same function to classify the other
> clauses.
>
> 4. May be the individual criteria in the comment block
> + /*
> + * Aggregate is safe to pushdown if
> + * 1. It is a built-in aggregate
> + * 2. All its arguments are safe to push-down
> + * 3. The functions involved are immutable.
> + * 4. Other expressions involved like aggorder,
> aggdistinct are
> + * safe to be pushed down.
> + */
> should be associated with the code blocks which implement those. As the
> criteria change it's difficult to maintain the numbered list in sync with
> the
> code.
>
> 5. The comment
> + /* Aggregates other than simple one are non-pushable. */
> should read /* Only non-split aggregates are pushable. */ as
> AGGSPLIT_SIMPLE
> means a complete, non-split aggregation step.
>
> 6. An aggregate function has transition, combination and finalization
> function
> associated with it. Should we check whether all of the functions are
> shippable?
> But probably it suffices to check whether aggregate function as whole is
> shippable or not using is_shippable() since it's the whole aggregate we are
> interested in and not the intermediate results. Probably, we should add a
> comment explaining why it's sufficient to check the aggregate function as a
> whole. I wondered whether we should check shippability of aggregate return
> type, but we don't do that for functions as well. So it's fine.
>
> 7. It looks like aggdirectargs, aggdistinct, aggorder expressions are all
> present in args list. If all the expressions in args are shippable, it
> means
> those lists are also shippable and hence corresponding aggregate
> subclauses are
> shippable. Are we unnecessarily checking shippability of those other lists?
>
> 8. The prologue of build_tlist_to_deparse() mentions that the output
> targetlist
> contains columns, which is not true with your patch. The targetlist
> returned by
> this function can have expressions for upper relations. Please update the
> prologue to reflect this change.
>
> 9. In build_tlist_to_deparse(), we are including aggregates from
> unshippable
> conditions without checking whether they are shippable or not. This can
> cause
> an unshippable expression to be pushed to the foreign server as seen below
>
> explain verbose select avg(c1) from ft1 having avg(c1 * random()) > 100;
> QUERY PLAN
>
> ------------------------------------------------------------
> --------------------------
> Foreign Scan (cost=112.50..133.53 rows=1 width=32)
> Output: (avg(c1))
> Filter: ((avg(((ft1.c1)::double precision * random()))) > '100'::double
> precision)
> Relations: Aggregate on (public.ft1)
> Remote SQL: SELECT avg(r1."C 1"), avg((r1."C 1" * random())) FROM "S
> 1"."T 1" r1
> (5 rows)
>
> We should check shippability of aggregates in unshippable conditions in
> foreign_grouping_ok(). If any of those are not shippable, aggregation can
> not
> be pushed down. Otherwise, include those in the grouped_tlist.
>
> 10. Comment /* Construct FROM clause */ should read "Construct FROM and
> WHERE
> clauses." to be in sync with the next function call. deparseFromClause()
> should
> be renamed as deparseFromExpr() inline with the naming convention of
> deparse<something> functions. From the function signature of
> deparseFromClause(), it doesn't become clear as to what contributes to the
> FROM
> clause. Should we add a comment in the prologue of that function
> explaining the
> same. Also it strikes odd that for an upper relation, we pass remote_conds
> to
> this function, but it doesn't deparse those but deparses the remote
> conditions
> from the scan relation and later deparses the same remote_conds as HAVING
> clause. Although this is correct, the code looks odd. May be the codeblock
> in
> deparseFuncClause()
> 1008 /*
> 1009 * For aggregates the FROM clause will be build from underneath
> scan rel.
> 1010 * WHERE clause conditions too taken from there.
> 1011 */
> 1012 if (foreignrel->reloptkind == RELOPT_UPPER_REL)
> 1013 {
> 1014 PgFdwRelationInfo *ofpinfo;
> 1015
> 1016 ofpinfo = (PgFdwRelationInfo *) fpinfo->outerrel->fdw_private;
> 1017 scan_rel = fpinfo->outerrel;
> 1018 context->foreignrel = scan_rel;
> 1019 remote_conds = ofpinfo->remote_conds;
> 1020 }
> should be moved into deparseSelectStmtForRel() to make the things clear.
>
> 11. Whether to qualify a column name by an alias should be based on
> whether the
> underlying scan relation is a join or base relation scan. That can be
> achieved
> by setting scanrel in the deparse_context. Attached patch has this
> implemented.
> Instead of scanrel, we may also have use_alias value as part of the
> context,
> which is used everywhere.
>
> 12. The code to print the function name in deparseAggref() seems similar to
> that in deparseFuncExpr(). Should we extract it out into a separate
> function
> and call that function in deparseAggref() and deparseFuncExpr() both?
>
> 13. While deparsing the direct aggregates, the code is not adding ","
> between
> two arguments, resulting in error like below.
> select rank(c1, c2) within group (order by c1, c2) from ft1 group by c1,
> c2;
> ERROR: syntax error at or near "c2"
> CONTEXT: Remote SQL command: SELECT rank("C 1"c2) WITHIN GROUP (ORDER BY
> "C
> 1", c2), "C 1", c2 FROM "S 1"."T 1" GROUP BY "C 1", c2
> May be we should add support to deparse List expressions by separating list
> items by "," similar to get_rule_expr() and use that here.
>
> 14. In deparseAggOrderBy() we do not output anything for default cases
> like ASC
> or NULLS FIRST (for DESC). But like appendOrderByClause() it's better to be
> explicit and output even the default specifications. That way, we do not
> leave
> anything to the interpretation of the foreign server.
>
> 15. deparseAggOrderBy supports deparsing USING subclause for ORDER BY for
> an
> aggregate, but there is no corresponding shippability check for ordering
> operator. Also, we should try to produce a testcase for the same.
>
> 16. The code to deparse a sort/group reference in deparseAggOrderBy() and
> appendGroupByClause() looks similar. Should we extract it out into a
> function
> by itself and call that function in those two places similar to
> get_rule_sortgroupclause()?
>
> 17. In postgresGetForeignPlan() the code to extract conditions and
> targetlist
> is duplicated for join and upper relation. Attached patch removes this
> duplication. Since DML, FOR SHARE/UPDATE is not allowed with aggregation
> and
> grouping, we should get an outer plan in that code.
>
> 18. estimate_path_cost_size() has grown quite long (~400 lines). Half of
> that
> function deals with estimating costs locally. Should we split this function
> into smaller functions, one for estimating size and costs of each kind of
> relations locally?
>
> 19. Condition below
> + if (sgref && query->groupClause && query->groupingSets == NIL &&
> + get_sortgroupref_clause_noerr(sgref, query->groupClause) !=
> NULL)
> can be rewritten as
> if (sgref && get_sortgroupref_clause_noerr(sgref, query->groupClause))
> since we won't come here with non-NULL query->groupingSets and
> get_sortgroupref_clause_noerr() will return NULL, if there aren't any
> groupClause.
>
> 20. Please use word "foreign" instead of "remote" in comments.
>
> 21. This code looks dubious
> + /* If not GROUP BY ref, reset it as we are not pushing those
> */
> + if (sgref)
> + grouping_target->sortgrouprefs[i] = 0;
> grouping_target comes from RelOptInfo, which is being modified here. We
> shouldn't be modifying anything in the RelOptInfo while checking whether
> the
> aggregate/grouping is shippable. You may want to copy the pathtaget and
> modify
> the copy.
>
> 22. Following comment needs more elaboration.
> + /*
> + * Add aggregates, if any, into tlist. Plain Var nodes
> pulled
> + * are already part of GROUP BY and thus no need to add
> them
> + * explicitly.
> + */
> Plain Var nodes will either be same as some GROUP BY expression or should
> be
> part of some GROUP BY expression. In the later case, the query can not
> refer
> those without the surrounding expression. which will be part of the
> targetlist
> list. Hence we don't need to add plain Var nodes in the targetlist. In fact
> adding those in SELECT clause will cause error on the foreign server if
> they
> are not part of GROUP BY clause.
>
> 23. Probably you don't need to save this in fpinfo. You may want to
> calculate
> it whenever it's needed.
> + fpinfo->grouped_exprs = get_sortgrouplist_exprs(query->groupClause,
> tlist);
>
> 24. Can this ever happen for postgres_fdw? The core sets fdwroutine and
> calls
> this function when the underlying scan relation has fdwroutine set, which
> implies that it called corresponding GetForeignPath hooks, which should
> have
> set fdw_private. Nonetheless, I think, still the condition is useful to
> avoid
> crashing the server in case the fdw_private is not set. But we need better
> comments.
> + /* If input rel is not aware of fdw, simply return */
> + if (!input_rel->fdw_private)
> + return;
>
> 25. Although it's desirable to get a switch case in
> postgresGetForeignUpperPaths() eventually, for this patch I guess we should
> have an if condition there.
>
> 26. Attached patch has restructured code in appendGroupByClause() to reduce
> indentation and make the code readable. Let me know if this looks good to
> you.
>
> 27. Prologue of create_foreign_grouping_paths() does not have formatting
> similar to the surrounding functions. Please fix it. Since the function
> "create"s and "add"s paths, it might be better to rename it as
> add_foreign_grouping_paths().
>
> 28. Isn't the following true for any upper relation and shouldn't it be
> part of
> postgresGetForeignUpperPaths(), rather than create_foreign_grouping_paths(
> )?
> + /*
> + * If input rel is not safe to pushdown, we cannot do grouping and/or
> + * aggregation on it.
> + */
> + if (!ifpinfo || !ifpinfo->pushdown_safe)
> + return;
>
> 29. We require RelOptInfo::relids since create_foreignscan_plan() copies
> them
> into ForeignPlan::fs_relids and executor uses them. So, I guess, we have
> to set
> those in the core somewhere. May be while calling fetch_upper_rel() in
> grouping_planner(), we should call it with relids of the underlying scan
> relation. I don't see any function calling fetch_upper_rel() with non-NULL
> relids. In fact, if we are to create an upper relation with relids in
> future,
> this code is going to wipe that out, which will be undesirable. Also,
> generally, when copying relids, it's better to use bms_copy() to make a
> copy
> of Bitmapset, instead of just assigning the pointer.
>
> 30. By the time postgresGetForeignUpperPaths() gets called, the core has
> already added its own paths, so it doesn't make much sense to set rows and
> width grouped_rel in create_foreign_grouping_paths().
>
> 31. fpinfo->server and user fields are being set twice, once in
> create_foreign_grouping_paths() then in foreign_grouping_ok(). Do we need
> that?
>
> 32.
> + /*
> + * Do we want to create paths with pathkeys corresponding for
> + * root->query_pathkeys.
> + */
> Yes, it would be desirable to do that. If we are not going to do that in
> this
> patch, may be remove that line or add a TODO kind of comment.
>
> 33. The patch marks build_path_tlist() as non-static but does not use it
> anywhere outside creatplan.c. Is this change intentional?
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

--

*Thanks & Regards,*

*Prabhat Kumar Sahu*

Mob: 7758988455

Skype ID: prabhat.sahu1984

www.enterprisedb.co <http://www.enterprisedb.com/>m
<http://www.enterprisedb.com/>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2016-09-12 07:07:33 Re:
Previous Message Craig Ringer 2016-09-12 06:46:26 Re: patch: function xmltable