From: | Richard Guo <riguo(at)pivotal(dot)io> |
---|---|
To: | Antonin Houska <ah(at)cybertec(dot)at> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] WIP: Aggregation push-down |
Date: | 2019-07-15 10:28:15 |
Message-ID: | CAN_9JTzAsBwv5zr4SRVO35OtMWQBXSLcUuY65NQ0ZtFCTOrMMA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jul 12, 2019 at 4:42 PM Antonin Houska <ah(at)cybertec(dot)at> wrote:
> Richard Guo <riguo(at)pivotal(dot)io> wrote:
>
> > I didn't fully follow the whole thread and mainly looked into the
> > latest
> > patch set. So what are the considerations for abandoning the
> > aggmultifn
> > concept?
>
> Originally the function was there to support join where both relations are
> grouped. My doubts about usefulness of such join started at [1]. (See the
> thread referenced from [2].)
>
> > In my opinion, aggmultifn would enable us to do a lot more
> > types of transformation. For example, consider the query below:
> >
> > select sum(foo.c) from foo join bar on foo.b = bar.b group by foo.a,
> > bar.a;
> >
> > With the latest patch, the plan looks like:
> >
> > Finalize HashAggregate <------ sum(psum)
> > Group Key: foo.a, bar.a
> > -> Hash Join
> > Hash Cond: (bar.b = foo.b)
> > -> Seq Scan on bar
> > -> Hash
> > -> Partial HashAggregate <------ sum(foo.c) as
> > psum
> > Group Key: foo.a, foo.b
> > -> Seq Scan on foo
> >
> >
> > If we have aggmultifn, we can perform the query this way:
> >
> > Finalize HashAggregate <------ sum(foo.c)*cnt
> > Group Key: foo.a, bar.a
> > -> Hash Join
> > Hash Cond: (foo.b = bar.b)
> > -> Seq Scan on foo
> > -> Hash
> > -> Partial HashAggregate <------ count(*) as cnt
> > Group Key: bar.a, bar.b
> > -> Seq Scan on bar
>
> Perhaps, but it that would require changes to nodeAgg.c, which I want to
> avoid
> for now.
>
> > And this way:
> >
> > Finalize HashAggregate <------ sum(psum)*cnt
> > Group Key: foo.a, bar.a
> > -> Hash Join
> > Hash Cond: (foo.b = bar.b)
> > -> Partial HashAggregate <------ sum(foo.c) as
> > psum
> > Group Key: foo.a, foo.b
> > -> Seq Scan on foo
> > -> Hash
> > -> Partial HashAggregate <------ count(*) as cnt
> > Group Key: bar.a, bar.b
> > -> Seq Scan on bar
>
> This looks like my idea presented in [1], for which there seems to be no
> common use case.
>
> > My another question is in function add_grouped_path(), when creating
> > sorted aggregation path on top of subpath. If the subpath is not
> > sorted,
> > then the sorted aggregation path would not be generated. Why not in
> > this
> > case we create a sort path on top of subpath first and then create
> > group
> > aggregation path on top of the sort path?
>
> I see no reason not to do it. (If you want to try, just go ahead.) However
> the
> current patch version brings only the basic functionality and I'm not
> going to
> add new functionality (such as parallel aggregation, partitioned tables or
> postgres_fdw) until the open design questions are resolved. Otherwise
> there's
> a risk that the additional work will be wasted due to major rework of the
> core
> functionality.
>
> > Core dump when running one query in agg_pushdown.sql
>
> Thanks for the report! Fixed in the new version.
>
Another core dump for query below:
select sum(t1.s1) from t1, t2, t3, t4 where t1.j1 = t2.j2 group by t1.g1,
t2.g2;
This is due to a small mistake:
diff --git a/src/backend/optimizer/util/relnode.c
b/src/backend/optimizer/util/relnode.c
index 10becc0..9c2deed 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -648,7 +648,7 @@ void
add_grouped_rel(PlannerInfo *root, RelOptInfo *rel, RelAggInfo *agg_info)
{
add_rel_info(root->grouped_rel_list, rel);
- add_rel_info(root->agg_info_list, rel);
+ add_rel_info(root->agg_info_list, agg_info);
}
Thanks
Richard
From | Date | Subject | |
---|---|---|---|
Next Message | Paul Guo | 2019-07-15 10:52:01 | Re: standby recovery fails (tablespace related) (tentative patch and discussion) |
Previous Message | Daniel Gustafsson | 2019-07-15 10:12:25 | Re: POC: converting Lists into arrays |