Re: Partition-wise aggregation/grouping

From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise aggregation/grouping
Date: 2017-09-08 12:17:56
Message-ID: CAM2+6=UakP9+TSJuh2fbhHWNJc7OYFL1_gvu7mt2fXtVt6GY3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Wed, Aug 23, 2017 at 4:43 PM, Jeevan Chalke <
jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:

> Hi,
>
> Attached is the patch to implement partition-wise aggregation/grouping.
>
> As explained earlier, we produce a full aggregation for each partition when
> partition keys are leading group by clauses and then append is performed.
> Else we do a partial aggregation on each partition, append them and then
> add
> finalization step over it.
>
> I have observed that cost estimated for partition-wise aggregation and cost
> for the plans without partition-wise aggregation is almost same. However,
> execution time shows significant improvement (as explained my in the very
> first email) with partition-wise aggregates. Planner chooses a plan
> according
> to the costs, and thus most of the time plan without partition-wise
> aggregation is chosen. Hence, to force partition-wise plans and for the
> regression runs, I have added a GUC named partition_wise_agg_cost_factor to
> adjust the costings.
>
> This feature is only used when enable_partition_wise_agg GUC is set to on.
>
> Here are the details of the patches in the patch-set:
>

Here are the new patch-set re-based on HEAD (f0a0c17) and
latest partition-wise join (v29) patches.

>
> 0001 - Refactors sort and hash final grouping paths into separate
> functions.
> Since partition-wise aggregation too builds paths same as that of
> create_grouping_paths(), separated path creation for sort and hash agg into
> separate functions. These functions later used by main partition-wise
> aggregation/grouping patch.
>
> 0002 - Passes targetlist to get_number_of_groups().
> We need to estimate groups for individual child relations and thus need to
> pass targetlist corresponding to the child rel.
>
> 0003 - Adds enable_partition_wise_agg and partition_wise_agg_cost_factor
> GUCs.
>
> 0004 - Implements partition-wise aggregation.
>
> 0005 - Adds test-cases.
>
> 0006 - postgres_fdw changes which enable pushing aggregation for other
> upper
> relations.
>

0007 - Provides infrastructure to allow partial aggregation
This will allow us to push the partial aggregation over fdw.
With this one can write SUM(PARTIAL x) to get a partial sum
result. Since PARTIAL is used in syntax, I need to move that
to a reserved keywords category. This is kind of PoC patch
and needs input over approach and the way it is implemented.

0008 - Teaches postgres_fdw to push partial aggregation
With this we can push aggregate on remote server when
GROUP BY key does not match with the PARTITION key too.

>
>
> Since this patch is highly dependent on partition-wise join [1], one needs
> to
> apply all those patches on HEAD (my repository head was at:
> 66ed3829df959adb47f71d7c903ac59f0670f3e1) before applying these patches in
> order.
>
> Suggestions / feedback / inputs ?
>
> [1] https://www.postgresql.org/message-id/CAFjFpRd9Vqh_=-Ldv-
> XqWY006d07TJ+VXuhXCbdj=P1jukYBrw(at)mail(dot)gmail(dot)com
>
>
>
--
Jeevan Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Attachment Content-Type Size
partition-wise-agg-v2.tar.gz application/x-gzip 32.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2017-09-08 12:23:35 Re: Red-Black tree traversal tests
Previous Message amul sul 2017-09-08 11:21:40 Re: UPDATE of partition key