Re: Combining Aggregates

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>
Subject: Re: Combining Aggregates
Date: 2016-01-21 01:32:52
Message-ID: CAKJS1f9eH172rz0-3YXRZg+SsU+UkQB_uUkXHYdbaddaiYVcmw@mail.gmail.com
Lists: pgsql-hackers

On 21 January 2016 at 04:59, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Wed, Jan 20, 2016 at 7:53 AM, David Rowley
> <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> > On 21 January 2016 at 01:44, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >>
> >> On Wed, Jan 20, 2016 at 7:38 AM, David Rowley
> >> <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> >> >> To my mind, priority #1 ought to be putting this fine new
> >> >> functionality to some use. Expanding it to every aggregate we've got
> >> >> seems like a distinctly second priority. That's not to say that it's
> >> >> absolutely gotta go down that way, but those would be my priorities.
> >> >
> >> > Agreed. So I've attached a version of the patch which does not have any
> >> > of the serialise/deserialise stuff in it.
> >> >
> >> > I've also attached a test patch which modifies the grouping planner to
> >> > add a Partial Aggregate node, and a final aggregate node when it's
> >> > possible. Running the regression tests with this patch only shows up
> >> > variances in the EXPLAIN outputs, which is of course expected.
> >>
> >> That seems great as a test, but what's the first patch that can put
> >> this to real and permanent use?
> >
> > There's no reason why parallel aggregates can't use the
> > combine_aggregate_state_d6d480b_2016-01-21.patch patch.
>
> I agree. Are you going to work on that? Are you expecting me to work
> on that? Do you think we can use Haribabu's patch? What other
> applications are in play in the near term, if any?

At the moment I think everything which will use this is queued up behind
the pathification of the grouping planner which Tom is working on. I think
naturally Parallel Aggregate makes sense to work on first, given all the
other parallel stuff in this release. I plan on working on that by
either assisting Haribabu, or... whatever else it takes.

The other two usages which I have thought of are:

1) Aggregating before UNION ALL, which might be fairly simple after the
grouping planner changes, as it may just be a matter of considering another
"grouping path" which partially aggregates before the UNION ALL, and
performs the final grouping stage after UNION ALL. At this stage it's hard
to say how that will work as I'm not sure how far changes to the grouping
planner will go. Perhaps Tom can comment?
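
To illustrate the idea only, here is a minimal Python sketch (not PostgreSQL
code). The function names and the sample rows are hypothetical. Each branch of
the UNION ALL is partially aggregated on its own, and a final stage combines
the per-branch states; the result should match aggregating after the UNION ALL.

```python
from collections import defaultdict

def partial_sum(rows):
    """Partial aggregate: collapse one branch to one sum state per group."""
    state = defaultdict(int)
    for key, qty in rows:
        state[key] += qty
    return dict(state)

def combine(states):
    """Final aggregate: merge the per-branch states group by group."""
    total = defaultdict(int)
    for state in states:
        for key, partial in state.items():
            total[key] += partial
    return dict(total)

def aggregate_after_union(branches):
    """Baseline: append all branches first, then aggregate once."""
    return partial_sum([row for branch in branches for row in branch])

# Hypothetical (group key, qty) rows standing in for two UNION ALL branches.
branch_a = [(1, 10), (2, 5), (1, 7)]
branch_b = [(2, 3), (3, 4)]

result = combine([partial_sum(branch_a), partial_sum(branch_b)])
print(result == aggregate_after_union([branch_a, branch_b]))
```

For sum() the combine step is just another addition; an aggregate such as
avg() would need to carry a (sum, count) state instead.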

2) Group before join, e.g. select p.description, sum(s.qty) from sale s inner
join product p on s.product_id = p.product_id group by p.description; where the
sale rows would first be grouped by s.product_id below the join. I have a
partial patch which implements this, although I was a bit stuck on whether I
should invent the concept of "GroupingPaths", or just inject alternative
subquery relations which are already grouped by the correct clause. The problem
with "GroupingPaths" was that the row estimates currently come from the
RelOptInfo and are set in set_baserel_size_estimates(), which always assumes
the ungrouped number of rows, which is not what's needed if the grouping is
already performed. I was holding off to see how Tom does this in the grouping
planner changes.
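
As a rough illustration of group before join, here is a minimal Python sketch
(not planner code). The sample sale rows, the product lookup and the function
names are all hypothetical. The sale rows are first partially summed by
product_id, the smaller grouped set is joined to product, and a final stage
groups by description.

```python
from collections import defaultdict

def join_then_group(sale, product):
    """Baseline: join every sale row to product, then sum by description."""
    totals = defaultdict(int)
    for product_id, qty in sale:
        if product_id in product:  # inner join semantics
            totals[product[product_id]] += qty
    return dict(totals)

def group_before_join(sale, product):
    """Partially sum by product_id, join, then combine by description."""
    partial = defaultdict(int)
    for product_id, qty in sale:
        partial[product_id] += qty

    totals = defaultdict(int)
    for product_id, qty_sum in partial.items():  # one row per product_id
        if product_id in product:  # inner join semantics
            totals[product[product_id]] += qty_sum
    return dict(totals)

# Hypothetical data: (product_id, qty) rows and product_id -> description.
sale = [(1, 2), (1, 3), (2, 4), (3, 1)]
product = {1: "widget", 2: "gadget", 3: "widget"}

print(group_before_join(sale, product) == join_then_group(sale, product))
```

Here the join sees three grouped rows rather than four sale rows, which is
the kind of difference a row estimate for the grouped relation would need to
reflect.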

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
