Re: [PATCH] Negative Transition Aggregate Functions (WIP)

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date: 2014-02-24 16:50:10
Message-ID: CAEZATCV78Ooift+fd7RtL9LnrnxFDk--TLPPxiTwfgZoiaKyFg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 20 February 2014 01:48, Florian Pflug <fgp(at)phlo(dot)org> wrote:
> On Jan29, 2014, at 13:45 , Florian Pflug <fgp(at)phlo(dot)org> wrote:
>> In fact, I'm
>> currently leaning towards just forbidding non-strict forward transition
>> function with strict inverses, and adding non-NULL counters to the
>> aggregates that then require them. It's really only the SUM() aggregates
>> that are affected by this, I think.
>
> I finally got around to doing that, and the results aren't too bad. The
> attached patches required that the strictness settings of the forward and
> reverse transition functions agree, and employ exactly the same NULL-skipping
> logic we always had.
>
> The only aggregates seriously affected by that change were SUM(int2) and
> SUM(int4).
>
> The SUM, AVG and STDDEV aggregates which use NumericAggState where
> already mostly prepared for this - all they required were a few adjustments
> to correctly handle the last non-NULL, non-NaN input being removed, and a few
> additional PG_ARGISNULL calls for the inverse transition functions since they're
> now non-strict. I've also modified them to unconditionally allocate the state
> at the first call, instead upon seeing the first non-NULL input, but that isn't
> strictly required. But without that, the state can have three classes of values -
> SQL-NULL, NULL pointer and valid pointer, and that's just confusing...
>
> SUM(int2) and SUM(int4) now simply use the same transition functions as
> AVG(int2) and AVG(int4), which use an int8 array to track the sum of the inputs
> and the number of inputs, plus a new final function int2int4_sum(). Previously,
> they used a single int8 as their state type.
>
> Since I was touching the code anyway, I removed some unnecessary inverse
> transition functions - namely, int8_avg_accum_inv and numeric_avg_accum_inv. These
> are completely identical to their non-avg cousins - the only difference between
> the corresponding forward transition functions is whether they request computation
> of sumX2 (i.e. the sum of squares of the inputs) or not.
>
> I haven't yet updated the docs - it'll do that if and when there's consensus
> about whether this is the way to go or not.
>

I haven't looked at this in any detail yet, but that seems much neater
to me. It seems perfectly sensible that the forward and inverse
transition functions should have the same strictness settings, and
enforcing that keeps the logic simple, as well as hopefully making it
easier to document.

It's a shame that more transition functions cannot be made strict,
when they actually ignore null values, but I think trying to solve
that can be regarded as outside the scope of this patch.

Regards,
Dean

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2014-02-24 17:09:37 Re: [SQL] Comparison semantics of CHAR data type
Previous Message Andres Freund 2014-02-24 16:24:28 Re: Patch: show xid and xmin in pg_stat_activity and pg_stat_replication