Re: Sharing aggregate states between different aggregate functions

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sharing aggregate states between different aggregate functions
Date: 2015-07-28 15:45:27
Message-ID: 55B7A397.4090905@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 07/28/2015 04:14 AM, David Rowley wrote:
> On 27 July 2015 at 20:11, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>
>> On 07/27/2015 08:34 AM, David Rowley wrote:
>>
>>> In this function I also wasn't quite sure if it was with comparing both
>>> non-NULL INITCOND's here. I believe my code comments may slightly
>>> contradict what the code actually does, as the comments talk about them
>>> having to match, but the code just bails if any are non-NULL. The reason I
>>> didn't check them was because it seems inevitable that some duplicate work
>>> needs to be done when setting up the INITCOND. Perhaps it's worth it?
>>
>> It would be nice to handle non-NULL initconds. I think you'll have to
>> check that the input function isn't volatile. Or perhaps just call the
>> input function, and check that the resulting Datum is byte-per-byte
>> identical, although that might be awkward to do with the current code
>> structure.
>
> I've not done anything with this.
> I'd not thought of an input function being volatile before, but I guess
> it's possible, which makes me a bit scared that we could be treading on
> ground we shouldn't be. I know it's more of an output function thing than
> an input function thing, but a GUC like extra_float_digits could cause
> problems here.

Yeah, a volatile input function seems highly unlikely, but who knows.
BTW, we're also not checking if the transition or final functions are
volatile. But that was the same before this patch too.

It sure would be nice to support the built-in float aggregates, so I
took a stab at this. I heavily restructured the code again, so that
there are now two separate steps. First, we check for any identical
Aggrefs that could be shared. If that fails, we proceed to the
permission checks, look up the transition function and build the initial
datum. And then we call another function that tries to find an existing,
compatible per-trans structure. I think this actually looks better than
before, and checking for identical init values is now easy. This does
lose one optimization: if there are two aggregates with identical
transition functions and final functions, they are not merged into a
single per-Agg struct. They still share the same per-Trans struct,
though, and I think that's enough.

How does the attached patch look to you? The comments still need some
cleanup, in particular, the explanations of the different scenarios
don't belong where they are anymore.

BTW, the permission checks were not correct before. You cannot skip the
check on the transition function when you're sharing the per-trans
state. We check that the aggregate's owner has permission to execute the
transition function, and the previous aggregate whose state value we're
sharing might have different owner.

>> Hmm. I think it should be "AggStatePerTransData" then, to keep the same
>> pattern as AggStatePerAggData and AggStatePerGroupData.
>>
> Sounds good. I've renamed it to that in the attached delta patch.

Thanks!

- Heikki

Attachment Content-Type Size
sharing_aggstate-heikki-2.patch binary/octet-stream 75.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2015-07-28 15:52:10 Re: Improving log capture of TAP tests with IPC::Run
Previous Message Andres Freund 2015-07-28 15:25:16 Re: WIP: Make timestamptz_out less slow.