Re: Partition-wise aggregation/grouping

From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise aggregation/grouping
Date: 2017-03-23 11:10:52
Message-ID: CAM2+6=UdiA0CTeX16OD3AsX=dMkxijOKvymwLL9pp20zoYARUg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 21, 2017 at 1:47 PM, Antonin Houska <ah(at)cybertec(dot)at> wrote:

> Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
>
> > Declarative partitioning is supported in PostgreSQL 10 and work is
> already in
> > progress to support partition-wise joins. Here is a proposal for
> partition-wise
> > aggregation/grouping. Our initial performance measurement has shown 7
> times
> > performance when partitions are on foreign servers and approximately 15%
> when
> > partitions are local.
> >
> > Partition-wise aggregation/grouping computes aggregates for each
> partition
> > separately. If the group clause contains the partition key, all the rows
> > belonging to a given group come from one partition, thus allowing
> aggregates
> > to be computed completely for each partition. Otherwise, partial
> aggregates
> > computed for each partition are combined across the partitions to
> produce the
> > final aggregates. This technique improves performance because:
>
> > i. When partitions are located on foreign server, we can push down the
> > aggregate to the foreign server.
>
> > ii. If hash table for each partition fits in memory, but that for the
> whole
> > relation does not, each partition-wise aggregate can use an in-memory
> hash
> > table.
>
> > iii. Aggregation at the level of partitions can exploit properties of
> > partitions like indexes, their storage etc.
>
> I suspect this overlaps with
>
> https://www.postgresql.org/message-id/29111.1483984605%40localhost
>
> I'm working on the next version of the patch, which will be able to
> aggregate
> the result of both base relation scans and joins. I'm trying hard to make
> the
> next version available before an urgent vacation that I'll have to take at
> random date between today and early April. I suggest that we coordinate the
> effort, it's lot of work in any case.
>

IIUC, it seems that you are trying to push down the aggregation into the
joining relations. So basically you are converting
Agg -> Join -> {scan1, scan2} into
FinalAgg -> Join -> {PartialAgg -> scan1, PartialAgg -> scan2}.
In addition to that your patch pushes aggregates on base rel to its
children,
if any.

Where as what I propose here is pushing down aggregation below the append
node keeping join/scan as is. So basically I am converting
Agg -> Append-> Join -> {scan1, scan2} into
Append -> Agg -> Join -> {scan1, scan2}.
This will require partition-wise join as posted in [1].
But I am planning to make this work for partitioned relations and not for
generic inheritance.

I treat these two as separate strategies/paths to be consider while
planning.

Our work will overlap when we are pushing down the aggregate on partitioned
base relation to its children/partitions.

I think you should continue working on pushing down aggregate onto the
joins/scans where as I will continue my work on pushing down aggregates to
partitions (joins as well as single table). Once we are done with these
task,
then we may need to find a way to integrate them.

[1]
https://www.postgresql.org/message-id/flat/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA(at)mail(dot)gmail(dot)com#CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@mail.gmail.com

>
> --
> Antonin Houska
> Cybertec Schönig & Schönig GmbH
> Gröhrmühlgasse 26
> A-2700 Wiener Neustadt
> Web: http://www.postgresql-support.de, http://www.cybertec.at
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kuntal Ghosh 2017-03-23 11:19:23 Re: exposing wait events for non-backends (was: Tracking wait event for latches)
Previous Message Rafia Sabih 2017-03-23 10:59:37 Re: pgbench - allow to store select results into variables