Re: Partition-wise aggregation/grouping

From: Antonin Houska <ah(at)cybertec(dot)at>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Partition-wise aggregation/grouping
Date: 2017-04-28 07:03:58
Message-ID: 23356.1493363038@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:

> On Thu, Apr 27, 2017 at 4:53 PM, Antonin Houska <ah(at)cybertec(dot)at> wrote:
>
> > Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > > Well, I'm confused. I see that there's a relationship between what
> > > Antonin is trying to do and what Jeevan is trying to do, but I can't
> > > figure out whether one is a subset of the other, whether they're both
> > > orthogonal, or something else. This plan looks similar to what I
> > > would expect Jeevan's patch to produce,

> > The point is that the patch Jeevan wanted to work on is actually a subset of
> > [1] combined with [2].

> Seems like, as you are targeting every relation whether or not it is
> partitioned.

Yes.

> With my patch, I am getting following plan where we push entire
> aggregation below append.
>
> QUERY PLAN
> ------------------------------------------
> Append
> -> HashAggregate
> Group Key: b_1.j
> -> Hash Join
> Hash Cond: (b_1.j = c_1.k)
> -> Seq Scan on b_1
> -> Hash
> -> Seq Scan on c_1
> -> HashAggregate
> Group Key: b_2.j
> -> Hash Join
> Hash Cond: (b_2.j = c_2.k)
> -> Seq Scan on b_2
> -> Hash
> -> Seq Scan on c_2
> (15 rows)

I think this is not generic enough because the result of the Append plan can
be joined to another relation. As such a join can duplicate the
already-aggregated values, the aggregates should not be finalized below the
top-level plan.

> Antonin, I have tried applying your patch on master but it doesn't get
> apply. Can you please provide the HEAD and any other changes required
> to be applied first?

I've lost that information. I'll post a new version to the [1] thread asap.

> How the plan look like when GROUP BY key does not match with the
> partitioning key i.e. GROUP BY b.v ?

EXPLAIN (COSTS false)
SELECT b.v, avg(b.v + c.v)
FROM b
JOIN
c ON b.j = c.k
GROUP BY b.v;

QUERY PLAN
------------------------------------------------
Finalize HashAggregate
Group Key: b_1.v
-> Append
-> Partial HashAggregate
Group Key: b_1.v
-> Hash Join
Hash Cond: (b_1.j = c_1.k)
-> Seq Scan on b_1
-> Hash
-> Seq Scan on c_1
-> Partial HashAggregate
Group Key: b_2.v
-> Hash Join
Hash Cond: (b_2.j = c_2.k)
-> Seq Scan on b_2
-> Hash
-> Seq Scan on c_2

> > [1] https://www.postgresql.org/message-id/9666.1491295317%40localhost
> >
> > [2] https://commitfest.postgresql.org/14/994/

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2017-04-28 07:43:34 PQhost may return socket dir for network connection
Previous Message Konstantin Knizhnik 2017-04-28 07:01:22 Bug in prepared statement cache invalidation?