Re: Combining Aggregates

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>
Subject: Re: Combining Aggregates
Date: 2015-12-21 12:30:48
Message-ID: CA+TgmoamHaBbLy7AP9bCM7HDVE4ij2SE-gkiQCKvquHADa+adw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 21, 2015 at 4:02 AM, David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> Ok, so it seems that my mindset was not in parallel process space when I
> was thinking about this. I think having the pointer in the Tuple is
> probably going to be fine for this multiple stage aggregation when that's
> occurring in a single backend process, but obviously if the memory that the
> pointer points to belongs to a worker process in a parallel aggregate
> situation, then bad things will happen.
>
> Now, there has been talk of this previously, on various threads, but I don't
> believe any final decisions were made on how exactly it should be done. At
> the moment I plan to make changes as follows:
>
> Add 3 new columns to pg_aggregate, aggserialfn, aggdeserialfn and
> aggserialtype These will only be required when aggtranstype is INTERNAL.
> Perhaps we should disallow CREATE AGGREAGET from accepting them for any
> other type... The return type of aggserialfn should be aggserialtype, and it
> should take a single parameter of aggtranstype. aggdeserialfn will be the
> reverse of that.
> Add a new bool field to nodeAgg's state named serialStates. If this is field
> is set to true then when we're in finalizeAgg = false mode, we'll call the
> serialfn on the agg state instead of the finalfn. This will allow the
> serialized state to be stored in the tuple and sent off to the main backend.
> The combine agg node should also be set to serialStates = true, so that it
> knows to deserialize instead of just assuming that the agg state is of type
> aggtranstype.
>
> I believe this should allow us to not cause any performance regressions by
> moving away from INTERNAL agg states. It should also be very efficient for
> internal states such as Int8TransTypeData, as this struct merely has 2 int64
> fields which should be very simple to stuff into a bytea varlena type. We
> don't need to mess around converting the ->count and ->sum into a strings or
> anything.
>
> Then in order for the planner to allow parallel aggregation all aggregates
> must:
>
> Not have a DISTINCT or ORDER BY clause
> Have a combinefn
> If aggtranstype = INTERNAL, must have a aggserialfn and aggdeserialfn.
>
> We can relax the requirement on 3 if we're using 2-stage aggregation, but
> not parallel aggregation.
>
> Any objections, or better ideas?

Can we use Tom's expanded-object stuff instead of introducing
aggserialfn and aggdeserialfn? In other words, if you have a
aggtranstype = INTERNAL, then what we do is:

1. Create a new data type that represents the transition state.
2. Use expanded-object notation for that data type when we're just
within a single process, and flatten it when we need to send it
between processes.

One thing to keep in mind is that we also want to be able to support a
plan that involves having one or more remote servers do partial
aggregation, send us the partial values, combine them across servers
and possibly also with locally computed-values, and the finalize the
aggregation. So it would be nice if there were a way to invoke the
aggregate function from SQL and get a transition value back rather
than a final value.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2015-12-21 12:50:40 Re: A typo in syncrep.c
Previous Message Heikki Linnakangas 2015-12-21 12:11:11 Re: Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule?