Re: Partition-wise aggregation/grouping

From: Antonin Houska <ah(at)cybertec(dot)at>
To: jeevan(dot)chalke(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Partition-wise aggregation/grouping
Date: 2017-04-26 10:28:02
Message-ID: 4127.1493202482@localhost
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Antonin Houska <ah(at)cybertec(dot)at> wrote:

> Antonin Houska <ah(at)cybertec(dot)at> wrote:
> >
> > Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> >
> > > Our work will overlap when we are pushing down the aggregate on partitioned
> > > base relation to its children/partitions.
> > >
> > > I think you should continue working on pushing down aggregate onto the
> > > joins/scans where as I will continue my work on pushing down aggregates to
> > > partitions (joins as well as single table). Once we are done with these task,
> > > then we may need to find a way to integrate them.
> > >
> > > [1] https://www.postgresql.org/message-id/flat/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA(at)mail(dot)gmail(dot)com#CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@mail.gmail.com
> >
> > My patch does also create (partial) aggregation paths below the Append node,
> > but only expects SeqScan as input. Please check if you patch can be based on
> > this or if there's any conflict.
>
> Well, I haven't imposed any explicit restriction on the kind of path to be
> aggregated below the Append path. Maybe the only thing to do is to merge my
> patch with the "partition-wise join" patch (which I haven't checked yet).

Attached is a diff that contains both patches merged. This is just to prove my
assumption, details to be elaborated later. The scripts attached produce the
following plan in my environment:

QUERY PLAN
------------------------------------------------
Parallel Finalize HashAggregate
Group Key: b_1.j
-> Append
-> Parallel Partial HashAggregate
Group Key: b_1.j
-> Hash Join
Hash Cond: (b_1.j = c_1.k)
-> Seq Scan on b_1
-> Hash
-> Seq Scan on c_1
-> Parallel Partial HashAggregate
Group Key: b_2.j
-> Hash Join
Hash Cond: (b_2.j = c_2.k)
-> Seq Scan on b_2
-> Hash
-> Seq Scan on c_2

Note that I had no better idea how to enforce the plan than hard-wiring zero
costs of the partial aggregation paths. This simulates the use case of partial
aggregation performed on remote node (postgres_fdw). Other use cases may
exist, but I only wanted to prove the concept in terms of coding so far.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at

Attachment Content-Type Size
agg_pushdown_partition_wise.diff text/x-diff 387.2 KB
test_setup_partition_wise.sql text/plain 568 bytes
query_partition_wise.sql text/plain 190 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2017-04-26 10:30:25 Re: Cached plans and statement generalization
Previous Message Heikki Linnakangas 2017-04-26 10:22:36 Re: scram and \password