Re: Partial aggregates pushdown

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)mitsubishielectric(dot)co(dot)jp>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Ilya Gladyshev <i(dot)gladyshev(at)postgrespro(dot)ru>
Subject: Re: Partial aggregates pushdown
Date: 2023-06-06 00:10:18
Message-ID: ZH55akxE21R3TNVh@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 2, 2023 at 03:54:06AM +0000, Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp wrote:
> Hi Mr.Bruce, hackers.
>
> I updated the patch.
> The following is a list of comments received on the previous version of the patch
> and my update to them in this version of the patch.

This thread started in October 2021 so I would like to explain what this
feature adds.

Basically for partitions made up of postgres_fdw tables, there are four
possible optimizations:

1. Pruning, 3 stages, see slide 30 here:

https://momjian.us/main/writings/pgsql/partitioning.pdf#page=30

2. Parallelism across partitions, see slide 38 here:

https://momjian.us/main/writings/pgsql/beyond.pdf#page=38

3. Pushdown of partition-wise joins and aggregates, see slide 43 here:

https://momjian.us/main/writings/pgsql/partitioning.pdf#page=43

4. Pushdown of aggregates that aren't partition-wise

As far as I know, over the years we have accomplished all of these
items, except for #4. #3 involves aggregates where the GROUP BY or
JOINed tables match the partition keys.

Number 4 involves things like a SUM our COUNT that does not match the
partition key, or has no groupings at all.

#3 is easier than #4 since we just need to pass _rows_ back from the
foreign servers. #4 is more complex because _partial_ count/sum, or
even average values must be passed from the foreign servers to the
requesting server.

The good news is that we already have partial aggregate support as part
of our parallel aggregate feature, see:

https://momjian.us/main/writings/pgsql/beyond.pdf#page=38

What the patch does is to expand the existing partial aggregate code to
allow partial aggregate results to pass from the foreign servers to the
requesting server. This feature will be very useful for data warehouse
queries that need to compute aggregate across partitions.

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2023-06-06 00:15:56 Re: Let's make PostgreSQL multi-threaded
Previous Message Peter Geoghegan 2023-06-05 23:50:11 Re: Let's make PostgreSQL multi-threaded