Re: WIP: Aggregation push-down

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Aggregation push-down
Date: 2017-09-07 11:15:52
Message-ID: CAFjFpRdpeMTd8kYbM_x0769V-aEKst5Nkg3+coG=8ki7s8Zqjw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 17, 2017 at 8:52 PM, Antonin Houska <ah(at)cybertec(dot)at> wrote:
> Antonin Houska <ah(at)cybertec(dot)at> wrote:
>
>> Antonin Houska <ah(at)cybertec(dot)at> wrote:
>>
>> > This is a new version of the patch I presented in [1].
>>
>> Rebased.
>>
>> cat .git/refs/heads/master
>> b9a3ef55b253d885081c2d0e9dc45802cab71c7b
>
> This is another version of the patch.
>
> Besides other changes, it enables the aggregation push-down for postgres_fdw,
> although only for aggregates whose transient aggregation state is equal to the
> output type. For other aggregates (like avg()) the remote nodes will have to
> return the transient state value in an appropriate form (maybe bytea type),
> which does not depend on PG version.

Having transient aggregation state type same as output type doesn't
mean that we can feed the output of remote aggregation as is to the
finalization stage locally or finalization is not needed at all. I am
not sure why is that test being used to decide whether or not to push
an aggregate (I assume they are partial aggregates) to the foreign
server. postgres_fdw doesn't have a feature to fetch partial aggregate
results from the foreign server. Till we do that, I don't think this
patch can push any partial aggregates down to the foreign server.

>
> shard.tgz demonstrates the typical postgres_fdw use case. One query shows base
> scans of base relation's partitions being pushed to shard nodes, the other
> pushes down a join and performs aggregation of the join result on the remote
> node. Of course, the query can only references one particular partition, until
> the "partition-wise join" [1] patch gets committed and merged with this my
> patch.

Right. Until then joins will not have children.

>
> One thing I'm not sure about is whether the patch should remove
> GetForeignUpperPaths function from FdwRoutine, which it essentially makes
> obsolete. Or should it only be deprecated so far? I understand that
> deprecation is important for "ordinary SQL users", but FdwRoutine is an
> interface for extension developers, so the policy might be different.

I doubt if that's correct. We can do that only when we know that all
kinds aggregates/grouping can be pushed down the join tree, which
looks impossible to me. Apart from that that FdwRoutine handles all
kinds of upper relations like windowing, distinct, ordered which are
not pushed down by this set of patches.

Here's review of the patchset
The patches compile cleanly when applied together.

Testcase "limit" fails in make check.

This is a pretty large patchset and the patches are divided by stages in the
planner rather than functionality. IOW, no single patch provides a full
functionality. Reviewing and probably committing such patches is not easy and
may take quite long. Instead, if the patches are divided by functionality, i.e.
each patch provides some functionality completely, it will be easy to review
and commit such patches with each commit giving something useful. I had
suggested breaking patches on that line at [1]. The patches also refactor
existing code. It's better to move such refactoring in a patch/es by itself.

The patches need more comments, explaining various decisions e.g.
if (joinrel)
{
/* Create GatherPaths for any useful partial paths for rel */
- generate_gather_paths(root, joinrel);
+ generate_gather_paths(root, joinrel, false);

/* Find and save the cheapest paths for this joinrel */
set_cheapest(joinrel);
For base relations, the patch calls generate_gather_paths() with
grouped as true and
false, but for join relations it doesn't do that. There is no comment
explaining why.
OR
in the following code
+ /*
+ * Do partial aggregation at base relation level if the relation is
+ * eligible for it.
+ */
+ if (rel->gpi != NULL)
+ create_grouped_path(root, rel, path, false, true, AGG_HASHED);
Why not to consider AGG_SORTED paths?

The patch maintains an extra rel target, two pathlists, partial pathlist and
non-partial pathlist for grouping in RelOptInfo. These two extra
pathlists require "grouped" argument to be passed to a number of functions.
Instead probably it makes sense to create a RelOptInfo for aggregation/grouping
and set pathlist and partial_pathlist in that RelOptInfo. That will also make a
place for keeping grouped estimates.

For placeholders we have two function add_placeholders_to_base_rels() and
add_placeholders_to_joinrel(). We have add_grouping_info_to_base_rels(), but do
not have corresponding function for joinrels. How do we push aggregates
involving two or more relations to the corresponding joinrels?

Some minor assorted comments
Avoid useless hunk.
@@ -279,8 +301,8 @@ add_paths_to_joinrel(PlannerInfo *root,
* joins, because there may be no other alternative.
*/
if (enable_hashjoin || jointype == JOIN_FULL)
- hash_inner_and_outer(root, joinrel, outerrel, innerrel,
- jointype, &extra);
+ hash_inner_and_outer(root, joinrel, outerrel, innerrel, jointype,
+ &extra);

In the last "regression" patch, there are some code changes (mostly because of
pg_indent run). May be you want to include those in appropriate code patches.

Some quick observations using two tables t1(a int, b int) and t2(a int, b int),
populated as
insert into t1 select i, i % 5 from generate_series(1, 100) i where i % 2 = 0;
insert into t2 select i, i % 5 from generate_series(1, 100) i where i % 3 = 0;

1. The patch pushes aggregates down join in the following query
explain verbose select avg(t2.a) from t1 inner join t2 on t1.b = t2.b group by
t2.b;
but does not push the aggregates down join in the following query
explain verbose select avg(t2.a) from t1 left join t2 on t1.b = t2.b group by
t2.b;
In fact, it doesn't use the optimization for any OUTER join. I think the reason
for this behaviour is that the patch uses equivalence classes to distribute the
aggregates and grouping to base relations and OUTER equi-joins do not form
equivalence classes. But I think it should be possible to push the aggregates
down the OUTER join by adding one row for NULL values if the grouping is pushed
to the inner side. I don't see much change for outer side. This also means that
we have to choose some means other than equivalence class for propagating the
aggregates.

2. Following query throws error with these patches, but not without the
patches.
explain verbose select sum(t1.a + t2.a) from t1, t2, t2 t3 where t1.a
= t2.a
group by t2.a, t1.a;
ERROR: ORDER/GROUP BY expression not found in targetlist

[1] https://www.postgresql.org/message-id/CAFjFpRejPP4K%3Dg%2B0aaq_US0YrMaZzyM%2BNUCi%3DJgwaxhMUj2Zcg%40mail.gmail.com

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2017-09-07 11:16:46 Re: path toward faster partition pruning
Previous Message Antonin Houska 2017-09-07 11:02:55 Re: Partition-wise join for join between (declaratively) partitioned tables