Re: Combining Aggregates

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>
Subject: Re: Combining Aggregates
Date: 2015-12-22 00:53:08
Message-ID: CAKJS1f9rmPrsXdnF14nxg6N7PcO+pZmtQGH3GmXuy0q-Vz4kXQ@mail.gmail.com
Lists: pgsql-hackers

On 22 December 2015 at 01:30, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Can we use Tom's expanded-object stuff instead of introducing
> aggserialfn and aggdeserialfn? In other words, if you have an
> aggtranstype = INTERNAL, then what we do is:
>
> 1. Create a new data type that represents the transition state.
> 2. Use expanded-object notation for that data type when we're just
> within a single process, and flatten it when we need to send it
> between processes.
>
>
I'd not seen this before, but having looked at it, I'm not sure it will be
practical to use here. I may have missed something, but it seems that after
each call of the transition function, I'd need to ensure that the INTERNAL
state was in the varlena format. This might be ok for a
state like Int8TransTypeData, since that struct has no pointers, but I
don't see how that could be done efficiently for NumericAggState, which has
two NumericVar, which will have pointers to other memory. The trans
function also has no idea whether it'll be called again for this state, so
it does not seem possible to delay the conversion until the final call of
the trans function.
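
To illustrate the difference between the two kinds of state, here is a
minimal standalone sketch in plain C. It does not use the real PostgreSQL
structs: FlatState and PtrState, and the four helper functions, are
hypothetical stand-ins invented for this example. FlatState plays the role
of a pointer-free struct such as Int8TransTypeData, and PtrState plays the
role of a state that owns separately allocated memory, as NumericAggState
does through its NumericVar members.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pointer-free state: every byte of it lives in the struct. */
typedef struct FlatState
{
    int64_t     count;
    int64_t     sum;
} FlatState;

/* Hypothetical state that points at memory outside the struct. */
typedef struct PtrState
{
    int64_t     count;
    int32_t     ndigits;
    int16_t    *digits;         /* separately allocated array */
} PtrState;

/* A flat state is already in a sendable form: a byte copy is a full copy. */
static void
flat_serialize(const FlatState *s, unsigned char *buf)
{
    memcpy(buf, s, sizeof(FlatState));
}

static void
flat_deserialize(const unsigned char *buf, FlatState *s)
{
    memcpy(s, buf, sizeof(FlatState));
}

/*
 * A state with pointers has to be walked and packed field by field.
 * Returns a malloc'd buffer (caller frees) and stores its size in *len.
 */
static unsigned char *
ptr_serialize(const PtrState *s, size_t *len)
{
    size_t          digits_len;
    unsigned char  *buf;
    unsigned char  *p;

    assert(s->ndigits >= 0);
    digits_len = (size_t) s->ndigits * sizeof(int16_t);
    *len = sizeof(s->count) + sizeof(s->ndigits) + digits_len;
    buf = malloc(*len);
    if (buf == NULL)
        return NULL;

    p = buf;
    memcpy(p, &s->count, sizeof(s->count));
    p += sizeof(s->count);
    memcpy(p, &s->ndigits, sizeof(s->ndigits));
    p += sizeof(s->ndigits);
    if (digits_len > 0)
        memcpy(p, s->digits, digits_len);
    return buf;
}

/* Rebuilds the state, allocating a fresh digits array. Returns 0 on success. */
static int
ptr_deserialize(const unsigned char *buf, PtrState *s)
{
    size_t      digits_len;

    memcpy(&s->count, buf, sizeof(s->count));
    buf += sizeof(s->count);
    memcpy(&s->ndigits, buf, sizeof(s->ndigits));
    buf += sizeof(s->ndigits);

    digits_len = (size_t) s->ndigits * sizeof(int16_t);
    s->digits = NULL;
    if (digits_len > 0)
    {
        s->digits = malloc(digits_len);
        if (s->digits == NULL)
            return -1;
        memcpy(s->digits, buf, digits_len);
    }
    return 0;
}
```

The flat case costs one memcpy, whereas the pointer case needs an
allocation and a field-by-field pack every time it is flattened. Doing that
after every transition call is the cost I'm worried about.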

> One thing to keep in mind is that we also want to be able to support a
> plan that involves having one or more remote servers do partial
> aggregation, send us the partial values, combine them across servers
> and possibly also with locally computed-values, and then finalize the
> aggregation. So it would be nice if there were a way to invoke the
> aggregate function from SQL and get a transition value back rather
> than a final value.

This will be possible with what I proposed. The Agg node will just need to
be set up with finalizeAggs=false, serialState=true. That way the returned
aggregate values will be the states converted into the serial type, on
which we can call the output function and send the result wherever we like.
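
As a rough illustration of that flow, here is a standalone sketch in plain
C for an avg()-like aggregate. None of these names come from the patch:
AvgState, partial_agg, final_agg and the avg_* helpers are hypothetical,
and the "count sum" text format merely stands in for whatever the serial
type's output function would produce. partial_agg corresponds to a node
that runs the transition function and returns the serialized state instead
of the final value; final_agg corresponds to the node that deserializes,
combines and finalizes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical transition state for an avg()-like aggregate. */
typedef struct AvgState
{
    int64_t     count;
    int64_t     sum;
} AvgState;

/* Transition step: fold one input value into the state. */
static void
avg_trans(AvgState *s, int64_t v)
{
    s->count++;
    s->sum += v;
}

/* Combine step: merge one partial state into another. */
static void
avg_combine(AvgState *dst, const AvgState *src)
{
    dst->count += src->count;
    dst->sum += src->sum;
}

/* Final step: turn the state into the aggregate result. */
static double
avg_final(const AvgState *s)
{
    return s->count == 0 ? 0.0 : (double) s->sum / (double) s->count;
}

/* Stand-in for the serial type's output function. Returns 0 on success. */
static int
avg_serialize(const AvgState *s, char *buf, size_t len)
{
    int         n = snprintf(buf, len, "%lld %lld",
                             (long long) s->count, (long long) s->sum);

    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}

/* Stand-in for the serial type's input function. Returns 0 on success. */
static int
avg_deserialize(const char *buf, AvgState *s)
{
    long long   count;
    long long   sum;

    if (sscanf(buf, "%lld %lld", &count, &sum) != 2)
        return -1;
    s->count = (int64_t) count;
    s->sum = (int64_t) sum;
    return 0;
}

/* Aggregate without finalizing, and return the state in serialized form. */
static int
partial_agg(const int64_t *vals, size_t n, char *out, size_t len)
{
    AvgState    s = {0, 0};
    size_t      i;

    assert(vals != NULL || n == 0);
    for (i = 0; i < n; i++)
        avg_trans(&s, vals[i]);
    return avg_serialize(&s, out, len);
}

/* Deserialize each partial state, combine them, then finalize. */
static int
final_agg(const char **partials, size_t n, double *result)
{
    AvgState    total = {0, 0};
    size_t      i;

    for (i = 0; i < n; i++)
    {
        AvgState    part;

        if (avg_deserialize(partials[i], &part) != 0)
            return -1;
        avg_combine(&total, &part);
    }
    *result = avg_final(&total);
    return 0;
}
```

Each remote server (or worker) would run something like partial_agg over
its own rows and send the resulting string; the receiving side passes all
of the strings to final_agg.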

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
