Re: WIP: Aggregation push-down

From: Antonin Houska <ah(at)cybertec(dot)at>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Aggregation push-down
Date: 2017-09-08 12:36:03
Message-ID: 22851.1504874163@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:

> 1.
> + if (aggref->aggvariadic ||
> + aggref->aggdirectargs || aggref->aggorder ||
> + aggref->aggdistinct || aggref->aggfilter)
>
> I did not understand, why you are not pushing aggregate in above cases?
> Can you please explain?

Currently I'm trying to implement infrastructure to propagate result of
partial aggregation from base relation or join to the root of the join tree
and to use these as input for the final aggregation. Some special cases are
disabled because I haven't thought about them enough so far. Some of these
restrictions may be easy to relax, others may be hard. I'll get back to them
at some point.

> 2. "make check" in contrib/postgres_fdw crashes.
>
> SELECT COUNT(*) FROM ft1 t1;
> ! server closed the connection unexpectedly
> ! This probably means the server terminated abnormally
> ! before or while processing the request.
> ! connection to server was lost
>
> From your given setup, if I wrote a query like:
> EXPLAIN VERBOSE SELECT count(*) FROM orders_0;
> it crashes.
>
> Seems like missing few checks.

Please consider this a temporary limitation.

> 3. In case of pushing partial aggregation on the remote side, you use schema
> named "partial", I did not get that change. If I have AVG() aggregate,
> then I end up getting an error saying
> "ERROR: schema "partial" does not exist".

Attached to his email is an extension that I hacked quickly at some point, for
experimental purposes only. It's pretty raw but may be useful if you just want
to play with it.

> 4. I am not sure about the code changes related to pushing partial
> aggregate on the remote side. Basically, you are pushing a whole aggregate
> on the remote side and result of that is treated as partial on the
> basis of aggtype = transtype. But I am not sure that is correct unless
> I miss something here. The type may be same but the result may not.

You're right. I mostly used sum() and count() in my examples but these are
special cases. It should also be checked whether aggfinalfn exists or
not. I'll fix it, thanks.

> 5. There are lot many TODO comments in the patch-set, are you working
> on those?

Yes. I wasn't too eager to complete all the TODOs soon because reviews of the
partch may result in a major rework. And if large parts of the code will have
to be wasted, some / many TODOs will be gone as well.

> Thanks
>
> On Thu, Aug 17, 2017 at 8:52 PM, Antonin Houska <ah(at)cybertec(dot)at> wrote:
>
> Antonin Houska <ah(at)cybertec(dot)at> wrote:
>
> > Antonin Houska <ah(at)cybertec(dot)at> wrote:
> >
> > > This is a new version of the patch I presented in [1].
> >
> > Rebased.
> >
> > cat .git/refs/heads/master
> > b9a3ef55b253d885081c2d0e9dc45802cab71c7b
>
> This is another version of the patch.
>
> Besides other changes, it enables the aggregation push-down for postgres_fdw,
> although only for aggregates whose transient aggregation state is equal to the
> output type. For other aggregates (like avg()) the remote nodes will have to
> return the transient state value in an appropriate form (maybe bytea type),
> which does not depend on PG version.
>
> shard.tgz demonstrates the typical postgres_fdw use case. One query shows base
> scans of base relation's partitions being pushed to shard nodes, the other
> pushes down a join and performs aggregation of the join result on the remote
> node. Of course, the query can only references one particular partition, until
> the "partition-wise join" [1] patch gets committed and merged with this my
> patch.
>
> One thing I'm not sure about is whether the patch should remove
> GetForeignUpperPaths function from FdwRoutine, which it essentially makes
> obsolete. Or should it only be deprecated so far? I understand that
> deprecation is important for "ordinary SQL users", but FdwRoutine is an
> interface for extension developers, so the policy might be different.
>
> [1] https://commitfest.postgresql.org/14/994/
>
> Any feedback is appreciated.
>
> --
> Antonin Houska
> Cybertec Schönig & Schönig GmbH
> Gröhrmühlgasse 26
> A-2700 Wiener Neustadt
> Web: http://www.postgresql-support.de, http://www.cybertec.at
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
> --
> Jeevan Chalke
> Principal Software Engineer, Product Development
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
>

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at

Attachment Content-Type Size
partial.tgz application/x-gzip 3.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message amul sul 2017-09-08 12:40:25 Re: [POC] hash partitioning
Previous Message Thomas Munro 2017-09-08 12:23:35 Re: Red-Black tree traversal tests