Re: Combining Aggregates

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: David Steele <david(at)pgmasters(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>
Subject: Re: Combining Aggregates
Date: 2016-03-16 11:03:58
Message-ID: CAKJS1f-Xc2a2y52+c+_SoC-Ki=XfHxHLz2raFOm4bgi_iMqyiQ@mail.gmail.com
Lists: pgsql-hackers

On 16 March 2016 at 10:34, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> If Haribabu's patch does all that's required for the numerical
> aggregates for floating point types then the status of covered
> aggregates is (in order of pg_aggregate.h):
>
> * AVG() complete coverage
> * SUM() complete coverage
> * MAX() complete coverage
> * MIN() complete coverage
> * COUNT() complete coverage
> * STDDEV + friends complete coverage
> * regr_*, covar_pop, covar_samp, corr not touched
> * bool*() complete coverage
> * bitwise aggs. complete coverage
> * Remaining are not touched. I see diminishing returns with making
> these parallel for now. I think it might not be worth pushing myself
> any harder to make these ones work.
>
> Does what I have done + floating point aggs from Haribabu seem
> reasonable for 9.6?

I've attached a series of patches.

Patch 1:
This is the parallel aggregate patch, not intended for review here.
However, all further patches are based on this, and this adds the
required planner changes to make it possible to test patches 2 and 3.

Patch 2:
This adds the serial/deserial aggregate infrastructure, pg_dump
support, CREATE AGGREGATE changes, and nodeAgg.c changes to have it
serialise and deserialise aggregate states when instructed to do so.

Patch 3:
This adds a boat load of serial/deserial functions, and combine
functions for most of the built-in numerical aggregate functions. It
also contains some regression tests which should really be in patch 2,
but with patch 2 alone there are no suitable serialisation or
de-serialisation functions to test CREATE AGGREGATE with. I think
having them here is ok, as patch 2 is quite useless without patch 3
anyway.
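
For readers new to the thread, here is a rough sketch of what a combine
function does, assuming a simplified AVG over doubles. The names
PartialAvgState, partial_avg_accum, partial_avg_combine and
partial_avg_final are made up for illustration and are not taken from
the patch; the real built-in aggregates use different state types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical partial state for AVG over doubles: a count and a running sum. */
typedef struct PartialAvgState
{
    int64_t count;
    double  sum;
} PartialAvgState;

/* Transition step: fold one input value into a worker's partial state. */
void
partial_avg_accum(PartialAvgState *state, double value)
{
    assert(state != NULL);
    state->count += 1;
    state->sum += value;
}

/* Combine step: merge one worker's partial state (src) into another (dst). */
void
partial_avg_combine(PartialAvgState *dst, const PartialAvgState *src)
{
    assert(dst != NULL && src != NULL);
    dst->count += src->count;
    dst->sum += src->sum;
}

/*
 * Final step: run once, on the fully combined state.
 * Returning 0.0 for an empty state is a simplification of this sketch;
 * SQL AVG() returns NULL when there are no input rows.
 */
double
partial_avg_final(const PartialAvgState *state)
{
    assert(state != NULL);
    return state->count > 0 ? state->sum / (double) state->count : 0.0;
}
```

Each worker would call the transition step on its own rows, the leader
would combine the partial states, and only then would the final step
run.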

Another thing to note about this patch is that I've created separate
serial/de-serial functions for the case where PolyNumAggState requires
sumX2 and the case where it doesn't. I had thought about perhaps putting an
extra byte in the serial format to indicate if a sumX2 is included,
but I ended up not doing it this way. I don't really want these serial
formats getting too complex as we might like to do fun things like
pass them along to sharded servers one day, so it might be nice to
keep them simple.
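
To illustrate the trade-off, a minimal sketch of the "two fixed
formats" approach follows, assuming a cut-down state with plain C
types. DemoAggState and the demo_* functions are hypothetical and are
not the patch's code; the raw memcpy in host byte order is also only
for illustration and would not be portable between machines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an aggregate transition state. */
typedef struct DemoAggState
{
    int64_t N;      /* number of values seen */
    double  sumX;   /* sum of values */
    double  sumX2;  /* sum of squares, only needed by variance-style aggs */
} DemoAggState;

#define DEMO_AVG_SIZE (sizeof(int64_t) + sizeof(double))
#define DEMO_VAR_SIZE (DEMO_AVG_SIZE + sizeof(double))

/* Format without sumX2: N followed by sumX. No flag byte is needed. */
size_t
demo_serialize_avg(const DemoAggState *s, unsigned char *buf)
{
    assert(s != NULL && buf != NULL);
    memcpy(buf, &s->N, sizeof(s->N));
    memcpy(buf + sizeof(s->N), &s->sumX, sizeof(s->sumX));
    return DEMO_AVG_SIZE;
}

void
demo_deserialize_avg(const unsigned char *buf, DemoAggState *s)
{
    assert(s != NULL && buf != NULL);
    memcpy(&s->N, buf, sizeof(s->N));
    memcpy(&s->sumX, buf + sizeof(s->N), sizeof(s->sumX));
    s->sumX2 = 0.0;     /* not part of this format */
}

/* Format with sumX2: the same prefix, then sumX2 appended. */
size_t
demo_serialize_var(const DemoAggState *s, unsigned char *buf)
{
    size_t off = demo_serialize_avg(s, buf);

    memcpy(buf + off, &s->sumX2, sizeof(s->sumX2));
    return DEMO_VAR_SIZE;
}

void
demo_deserialize_var(const unsigned char *buf, DemoAggState *s)
{
    demo_deserialize_avg(buf, s);
    memcpy(&s->sumX2, buf + DEMO_AVG_SIZE, sizeof(s->sumX2));
}
```

Because each aggregate is bound to exactly one serial/de-serial pair,
the reader never has to inspect the payload to learn its layout, which
is what keeps each format simple.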

Patch 4:
Adds a bunch of opr_sanity regression tests. This could be part of
patch 3, but 3 was quite big already.

Patch 5:
Adds some documentation to indicate which aggregates allow partial mode.

Comments welcome.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Allow-aggregation-to-happen-in-parallel_2016-03-16.patch application/octet-stream 45.7 KB
0002-Allow-INTERNAL-state-aggregates-to-participate-in-pa_2016-03-16.patch application/octet-stream 94.1 KB
0003-Add-various-aggregate-combine-and-serialize-de-seria_2016-03-16.patch application/octet-stream 58.1 KB
0004-Add-sanity-regression-tests-for-new-aggregate-serial_2016-03-16.patch application/octet-stream 4.7 KB
0005-Add-documents-to-explain-which-aggregates-support-pa_2016-03-16.patch application/octet-stream 16.7 KB
