Re: Aggregate Push Down - Performing aggregation on foreign server

From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Aggregate Push Down - Performing aggregation on foreign server
Date: 2016-09-30 11:53:09
Message-ID: CAM2+6=U2Y1j9SpJzh2br63J=YPMQLHk6GNfivJ402uwdGVRBRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Ashutosh,

On Mon, Sep 26, 2016 at 2:28 PM, Ashutosh Bapat <
ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:

> Thanks Jeevan for working on the comments.
>
> Ok. Yes, we should also handle bare conditions in
> classifyConditions(). I am fine doing it as a separate patch.
>

Doing that with separate patch would be good.

>
> I don't think add_foreign_grouping_path() is the right place to change
> a structure managed by the core and esp when we are half-way adding
> paths. An FDW should not meddle with an upper RelOptInfo. Since
> create_foreignscan_plan() picks those up from RelOptInfo and both of
> those are part of core, we need a place in core to set the
> RelOptInfo::relids for an upper relation OR we have stop using
> fs_relids for upper relation foreign scans.
>

Yes, agree that relids must be set by the core rather than a fdw.
However I don't want to touch the core for this patch and also not
exactly sure where we can do that. I think, for this patch, we can
copy relids into grouped_rel in create_grouping_paths() at place
where we assign fdwroutine, userid etc. from the input relation.

>
> Here are some more comments on the patch, mostly focused on the tests.
> 1. If we dig down each usage of deparse_expr_cxt::scanrel, it boils down
> checking whether a given Var comes from given scan relation (deparseVar())
> or
> checking whether a given Var needs relation qualification while deparsing
> (again deparseVar()). I think both of those purposes can be served by
> looking
> at scanrel::relids, multiple relids implying relation qualification. So,
> instead of having a separate member scan_rel, we should instead fetch the
> relids from deparse_expr_cxt::foreignrel. I am assuming that the proposed
> change to set relids in upper relations is acceptable. If it's not, we
> should
> pass scanrel->relids through the context instead of scanrel itself.
>

Yep. Removed scanrel altogether and used relids whenever required.

>
> 2. SortGroupClause is a parser node, so we can name
> appendSortGroupClause() as
> deparseSortGroupClause(), to be consistent with the naming convention. If
> we do
> that change, may be it's better to pass SortGroupClause as is and handle
> other
> members of that structure as well.
>

Renamed appendSortGroupClause() to deparseSortGroupClause().
I have kept this function in sync with get_rule_sortgroupclause() which
takes the Index ref from SortGroupClause(). This function require just
an index and thus passing SortGroupClause as whole is unnecessary. However
we cannot pass entire order by list or group by list, because in case of
order by list we need some extra processing on list elements. So passing
just Index is sufficient and in sync with get_rule_sortgroupclause() too.

> 4. Following line is correct as long as there is only one upper relation.
> + context.scanrel = (rel->reloptkind == RELOPT_UPPER_REL) ?
> fpinfo->outerrel : rel;
> But as we push down more and more of upper operation, there will be a
> chain of
> upper relations on top of scan relation. So, it's better to assert that the
> scanrel is a join or a base relation to be future proof.
>

After making changes reported in (1), this line has removed.
For future proof, as discussed offline, added Assert() in deparseFromExpr().

> 5. In deparseConst(), showtype is compared with hardcoded values. The
> callers of this function pass hardcoded values. Instead, we should
> define an enum and use it. I understand that this logic has been borrowed
> from
> get_const_expr(), which also uses hardcoded values. Any reason why not to
> adopt
> a better style here? In fact, it only uses two states for showtype, 0 and
> ">
> 0". Why don't we use a boolean then OR why isn't the third state in
> get_const_expr() applicable here?
>

We certainly can add an enum here, but for consistency with other related
code I think using hard-coded value is good for now. Also I see this
comment in prologue of deparseConst()
* This function has to be kept in sync with ruleutils.c's get_const_expr.

So better to have handling like it.
Also, currently we use only two values for showtype. But for consistency
let use int instead of bool. In future if we add support for coercion
expr, then we need this third value. At that time we will not need changes
here.
However if required, we can submit a separate patch for adding enum
instead of int for showtype in ruleutils.c.

>
> 7. The changes in block ...
> should be in sync. The later block adds both aggregates and Vars but the
> first
> one doesn't. Why is this difference?
>

add_to_flat_tlist() is taking care of duplicate entries and thus in second
block, I was calling it after the loop to avoid calling it for every list
element. Well, moved that inside loop like first block.

>
> 9. EXPLAIN of foreign scan is adding paranthesis around output expressions,
> which is not needed. EXPLAIN output of other plan nodes like Aggregate
> doesn't
> do that. If it's easy to avoid paranthesis, we should do it in this patch.
> With
> this patch, we have started pushing down expressions in the target list, so
> makes sense to fix it in this patch.
>

ExecInitForeignScan() while determining scan tuple type from passed
fdw_scan_tlist, has target list containing Vars with varno set to
INDEX_VAR. Due to which while deparsing we go through
resolve_special_varno() and get_special_variable() function which
forces parenthesis around the expression.
I can't think of any easy and quick solution here. So keeping this
as is. Input will be welcome or this can be done separately later.

>
> 10. While deparsing the GROUP BY clauses, the code fetches the expressions
> and
> deparses the expressions. We might save some CPU cycles if we deparse the
> clauses by their positional indexes in SELECT clause instead of the actual
> expressions. This might make the query slightly unreadable. What do you
> think?
>

To me it is better to have expression in the GROUP BY clause as it is
more readable and easy to understand. Also it is in sync with deparsing
logic in ruleutils.c.

>
> Comments about tests:
>
> 4. May be bulk of these testcases need to be moved next to Join testcases.
> That way their outputs do not change in case the DMLs change in future. In
> case
> you do that, adjust capitalization accordingly.
>

Moved testcases after Join testcases.
However I have not made any capitalization changes. I see many tests
like those elsewhere in the file too. Let me know if that's the policy
to have appropriate capitalization in test-case. I don't want violate
the policy if any and will prefer to do the changes as per the policy.

>
> 13. Wouldn't having random() in the testcase, make them produce different
> output in different runs? Also, please see if we can reduce the output
> rows to
> a handful few. Following query produces 21 rows. Is it really important to
> have
> all those rows in the output?
> +select c2, sum(c1) from ft2 group by c2 having sum(c1) * random() <
> 500000 order by c2;
>

The condition is written in such a way that we will get all rows,
nullifying the random() effect, so I don't think there will be any
issue with the multiple runs.

>
> 16. In the following query, the subquery returns only a single row, so
> DISTINCT
> doesn't have any effects. Please change the subquery to return multiple
> rows
> with some duplicates so that DISTINCT will be propertly tested.
> +select distinct (select count(*) filter (where t2.c2 = 6 and t2.c1 <
> 10) from ft1 t1 where t1.c1 = 6) from ft1 t2 order by 1;
> What's the difference between this query and the next one. The comment
> mentions
> INNER and OUTER queries, but I don't see that difference. Am I missing
> something?
>

In case of INNER aggregate query, we get multiple rows and thus used
DISTINCT to have small number of output. However in case of OUTER
aggregate query, we get single row. But to keep that query much
identical with INNER query, I have used DISTINCT. It becomes easy to
compare and follow too.
Please see the EXPLAIN output for observing difference between INNER
and OUTER aggregate query. When inner query is aggregate query you
should see that aggregation is performing on t1 and when outer query
is aggregate query, you will see t2.

>
> 20. Why do we need the last statement ALTERing server extensions? Is it not
> already done when the function was added to the extension?
> +alter extension postgres_fdw drop function least_accum(anyelement,
> variadic anyarray);
> +alter extension postgres_fdw drop aggregate least_agg(variadic items
> anyarray);
> +alter server loopback options (set extensions 'postgres_fdw');
>

Whenever we alter extension, we must refresh the server and thus required
last statement ALTERing server. Without that elements added into extension
does not reflect in the server.

> 23. This comment talks about deeper code level internal details. Should
> have a
> better user-visible explanations.
> +-- ORDER BY expression is not present as is in target list of remote
> query. This needed clearing out sortgrouprefs flag.
>

Not sure how can I put those comments without referring some code level
internal details. I have tried altering those comments but it will be
good if you too try rephrasing it.

> postgres_fdw.out has grown by 2000 lines because of testcases in this
> patch. We need to compact the testcases so that easier to maintain in
> the future.
>

I have removed many duplicate tests and also combined many of them.
Also removed tests involving local tables. Testcase changes now
become 1/3 of earlier one.

In the attached patch I have fixed all other review comments you have
posted. All the comments were excellent and helped me a lot to improve
in various areas.

Thanks

--
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Attachment Content-Type Size
pg_agg_push_down_v3.patch text/x-patch 122.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeevan Chalke 2016-09-30 11:58:43 Re: Aggregate Push Down - Performing aggregation on foreign server
Previous Message Peter Geoghegan 2016-09-30 11:47:50 Re: Hash Indexes