Re: [HACKERS] WIP: Aggregation push-down

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Richard Guo <riguo(at)pivotal(dot)io>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] WIP: Aggregation push-down
Date: 2019-07-12 08:41:51
Message-ID: 13971.1562920911@spoje.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Richard Guo <riguo(at)pivotal(dot)io> wrote:

> I didn't fully follow the whole thread and mainly looked into the
> latest
> patch set. So what are the considerations for abandoning the
> aggmultifn
> concept?

Originally the function was there to support join where both relations are
grouped. My doubts about usefulness of such join started at [1]. (See the
thread referenced from [2].)

> In my opinion, aggmultifn would enable us to do a lot more
> types of transformation. For example, consider the query below:
>
> select sum(foo.c) from foo join bar on foo.b = bar.b group by foo.a,
> bar.a;
>
> With the latest patch, the plan looks like:
>
> Finalize HashAggregate    <------ sum(psum)
>    Group Key: foo.a, bar.a
>    ->  Hash Join
>          Hash Cond: (bar.b = foo.b)
>          ->  Seq Scan on bar
>          ->  Hash
>                ->  Partial HashAggregate    <------ sum(foo.c) as
> psum
>                      Group Key: foo.a, foo.b
>                      ->  Seq Scan on foo
>
>
> If we have aggmultifn, we can perform the query this way:
>
> Finalize HashAggregate    <------ sum(foo.c)*cnt
>    Group Key: foo.a, bar.a
>    ->  Hash Join
>          Hash Cond: (foo.b = bar.b)
>          ->  Seq Scan on foo
>          ->  Hash
>                ->  Partial HashAggregate    <------ count(*) as cnt
>                      Group Key: bar.a, bar.b
>                      ->  Seq Scan on bar

Perhaps, but it that would require changes to nodeAgg.c, which I want to avoid
for now.

> And this way:
>
> Finalize HashAggregate    <------ sum(psum)*cnt
>    Group Key: foo.a, bar.a
>    ->  Hash Join
>          Hash Cond: (foo.b = bar.b)
>                ->  Partial HashAggregate    <------ sum(foo.c) as
> psum
>                      Group Key: foo.a, foo.b
>                      ->  Seq Scan on foo
>          ->  Hash
>                ->  Partial HashAggregate    <------ count(*) as cnt
>                      Group Key: bar.a, bar.b
>                      ->  Seq Scan on bar

This looks like my idea presented in [1], for which there seems to be no
common use case.

> My another question is in function add_grouped_path(), when creating
> sorted aggregation path on top of subpath. If the subpath is not
> sorted,
> then the sorted aggregation path would not be generated. Why not in
> this
> case we create a sort path on top of subpath first and then create
> group
> aggregation path on top of the sort path?

I see no reason not to do it. (If you want to try, just go ahead.) However the
current patch version brings only the basic functionality and I'm not going to
add new functionality (such as parallel aggregation, partitioned tables or
postgres_fdw) until the open design questions are resolved. Otherwise there's
a risk that the additional work will be wasted due to major rework of the core
functionality.

> Core dump when running one query in agg_pushdown.sql

Thanks for the report! Fixed in the new version.

> This is really a cool feature. Thank you for working on this.

I appreciate to hear that :-) Let's see which postgres release will adopt it.

[1] https://www.postgresql.org/message-id/cc823e89-3fbc-f94e-b9d4-9c713b044b5d%402ndquadrant.com

[2] https://www.postgresql.org/message-id/flat/9666.1491295317%40localhost

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

Attachment Content-Type Size
v13-0001-Introduce-RelInfoList-structure.patch text/x-diff 53.7 KB
v13-0002-Introduce-make_join_rel_common-function.patch text/x-diff 2.9 KB
v13-0003-Aggregate-push-down-basic-functionality.patch text/x-diff 108.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message tushar 2019-07-12 09:23:21 Re: Minimal logical decoding on standbys
Previous Message Kyotaro Horiguchi 2019-07-12 08:37:25 Re: [HACKERS] WAL logging problem in 9.4.3?