Re: [PATCH] Negative Transition Aggregate Functions (WIP)

From: Florian Pflug <fgp(at)phlo(dot)org>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date: 2014-01-16 02:47:51
Message-ID: 357A911D-DC0E-42FC-90AB-43A2015FA1C9@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Jan16, 2014, at 02:32 , Florian Pflug <fgp(at)phlo(dot)org> wrote:
> On Jan14, 2014, at 17:39 , Florian Pflug <fgp(at)phlo(dot)org> wrote:
>> On Jan14, 2014, at 11:06 , David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>>> Here's a patch which removes sum(numeric) and changes the documents a little to remove a reference to using sum(numeric) to workaround the fact that there's no inverse transitions for sum(float). I also made a small change in the aggregates.sql tests to check that the aggfnoid <= 9999.
>>
>> I've looked over the patch, here a few comments.
>>
>> For STRICT pairs of transfer and inverse transfer functions we should complain if any of them ever return NULL. That can never be correct anyway, since a STRICT function cannot undo a non-NULL -> NULL transition.
>>
>> The same goes for non-STRICT, at least if we ever want to allow an inverse transfer function to indicate "Sorry, cannot undo in this case, please rescan". If we allowed NULL as just another state value now, we couldn't easily undo that later, so we'd rob ourselves of the obvious way for the inverse transfer function to indicate this condition to its caller.
>>
>> The notnullcount machinery seems to apply to both STRICT and non-STRICT transfer function pairs. Shouldn't that be constrained to STRICT transfer function pairs? For non-STRICT pairs, it's up to the transfer functions to deal with NULL inputs however they please, no?
>
> I modified the patch to do this, and ran into a problem. Currently, aggregates with state type "internal" cannot have a strict transfer function, even if they behave like they did, i.e. ignore non-NULL inputs. This is because the only possible initial value for state type "internal" is NULL, and it's up to the transfer function to create the state - usually upon seeing the first non-NULL input. Now, currently that isn't a huge problem - the transfer function simply has to check for NULL input values itself, and if it's indeed conceptually strict, it just returns in this case.
>
> With inverse transfer functions, however, each such pair of forward and inverse transfer functions would also need to duplicate the NULL-counting logic from nodeWindowAgg.c if it want's to be conceptually strict, i.e. ignore NULL inputs, but return NULL if there are *only* NULL inputs (or no inputs at all). That seems like a lot of duplicated code in the long run.
>
> In essence, what we'd want for strict pairs of forward and inverse transfer functions is for the forward transfer function to be strict in all arguments except the state, and the inverse to be strict according to the usual definition. We can't express that property of the forward transfer function within pg_proc, but we don't really have to - a local hack in nodeWindowAgg.c suffices. So what I'm proposing is:
>
> We allow the forward transfer function to be non-strict even if the inverse is strict, but only if the initial value is NULL. In that case we behave as if the forward transfer function was strict, except that upon seeing the first non-NULL input we call it with a NULL state. The return value must still be non-NULL in all cases.

Ok, I tried this and it worked out quite OK.

Updated patch is attached. It passes regression tests, but those currently don't seem to include any aggregates which *don't* ignore NULL values, so that case is probably untested. Also, it allows non-strict forward transfer functions together with strict inverse transfer functions even for non-NULL initial states now. It seemed rather pedantic to forbid this.

BTW, as it stands, the patch currently uses the inverse transition function only when *all* the aggregates belonging to one window clause provide one. After working with the code a bit, I think that a bit of reshuffling would allow that distinction to be made on a per-aggregate basis. The question is, is it worth it?

best regards,
Florian Pflug

Attachment Content-Type Size
inverse_transition_functions_v2.2_fgp.patch.gz application/x-gzip 21.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jan Kara 2014-01-16 02:58:46 Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Previous Message Robert Haas 2014-01-16 02:37:16 Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance