Re: Sharing aggregate states between different aggregate functions

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sharing aggregate states between different aggregate functions
Date: 2015-07-27 05:34:11
Message-ID: CAKJS1f-36W-3ODF-0pa2g_hmQVJH5Or-M-o9WPA0DqFp0+UBSw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 27 July 2015 at 03:24, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:

> On 07/09/2015 12:44 PM, David Rowley wrote:
>
>> On 15 June 2015 at 12:05, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
>> wrote:
>>
>>
>>> This basically allows an aggregate's state to be shared between other
>>> aggregate functions when both aggregate's transition functions (and a few
>>> other things) match
>>> There's quite a number of aggregates in our standard set which will
>>> benefit from this optimisation.
>>>
>>> After compiling the original patch with another compiler, I noticed a
>> couple of warnings.
>>
>> The attached fixes these.
>>
>
> I spent some time reviewing this. I refactored the ExecInitAgg code rather
> heavily to make it more readable (IMHO); see attached. What do you think?
> Did I break anything?
>

Thanks for taking the time to look at this and makes these fixes.

I'm just looking over your changes:

- ereport(ERROR,
- (errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
- errmsg("aggregate %u needs to have compatible input type and transition
type",
- aggref->aggfnoid)));
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+ errmsg("aggregate needs to have compatible input type and transition
type")));

I can't quite see the reason to remove the agg OID from the error message
here. It seems to be still valid to use as build_peraggstate_for_aggref()
only is called when nothing is shared.

- * agg_input_types, agg_state_type, agg_result_type identify the input,
- * transition, and result types of the aggregate. These should all be
- * resolved to actual types (ie, none should ever be ANYELEMENT etc).
+ * agg_input_types identifies the input types of the aggregate. These
should
+ * be resolved to actual types (ie, none should ever be ANYELEMENT etc).

I'm not sure I understand why agg_state_type and agg_result_type have
changed here.

+ peraggstate->sortstates = (Tuplesortstate **)
+ palloc0(sizeof(Tuplesortstate *) * numGroupingSets);
+ for (currentsortno = 0; currentsortno < numGroupingSets; currentsortno++)
+ peraggstate->sortstates[currentsortno] = NULL;

This was not you, but this NULL setting looks unneeded due to the palloc0().

> Some comments:
>
> * In aggref_has_compatible_states(), you give up if aggtype or aggcollid
> differ. But those properties apply to the final function, so you were
> leaving some money on the table by disallowing state-sharing if they differ.
>

Good catch, and accurate analogy. Thanks for fixing.

>
> * The filter and input expressions are initialized for every AggRef,
> before the deduplication logic kicks in. The AggrefExprState.aggfilter,
> aggdirectargs and args fields really belong to the AggStatePerAggState
> struct instead. This is not a new issue, but now that we have a convenient
> per-aggstate struct to put them in, let's do so.
>

Good idea. I failed to notice that code over there in execQual.c so I agree
that where you've moved it to is much better.

>
> * There was a reference-after free bug in aggref_has_compatible_states;
> you cannot ReleaseSysCache and then continue pointing to the struct.
>

Thanks for fixing.

In this function I also wasn't quite sure if it was with comparing both
non-NULL INITCOND's here. I believe my code comments may slightly
contradict what the code actually does, as the comments talk about them
having to match, but the code just bails if any are non-NULL. The reason I
didn't check them was because it seems inevitable that some duplicate work
needs to be done when setting up the INITCOND. Perhaps it's worth it?

select aggfnoid || '(' || typname || ')',aggtransfn,agginitval
from pg_aggregate
inner join pg_type on aggtranstype = oid
where aggtransfn in (select aggtransfn
from pg_aggregate
group by aggtransfn
having count(*)>1)
order by aggtransfn;

This indicates that everything using float4_accum as a transfn could
benefit from that. I just wasn't sure how far to go.

> * The code was a bit fuzzy on which parts of the per-aggstate are filled
> in at what time. Some of the fields were overwritten every time a match was
> found. With the same values, so no harm done, but I found it confusing. I
> refactored ExecInitAgg in the attached patch to clear that up.
>

> * There API of build_aggregate_fnexprs() was a bit strange now that some
> callers use it to only fill in the final function, some call it to fill
> both the transition functions and the final function. I split it to two
> separate functions.
>
>
That's much better.

> * I wonder if we should do this duplicate elimination at plan time. It's
> very fast, so I'm not worried about that right now, but you had grand plans
> to expand this so that an aggregate could optionally use one of many
> different kinds of state values. At that point, it certainly seems like a
> planning decision to decide which aggregates share state. I think we can
> leave it as it is for now, but it's something to perhaps consider later.
>
>
I don't think I'm going to get the time to work on the "supporting
aggregate" stuff you're talking about, but I think it's a good enough idea
to keep around for the future, so I think this shared aggregate states
stuff probably should go into nodeAgg.c for now. I have to say though, I
was a little surprised to find this code in the executor rather than the
planner when I first started on this.

>
> BTW, the name of the AggStatePerAggStateData struct is pretty horrible.
> The repeated "AggState" feels awkward. Now that I've stared at the patch
> for a some time, it doesn't bother me anymore, but it took me quite a while
> to over that. I'm sure it will for others too. And it's not just that
> struct, the comments talk about "aggregate state", which could be confused
> to mean "AggState", but it actually means AggStatePerAggStateData. I don't
> have any great suggestions, but can you come up a better naming scheme?

I agree, they're horrible. The thing that's causing the biggest problem is
the struct named AggState, which carries state for *all* aggregates, and
has nothing to do with "transition state", so it seems there's two
different meanings if the word "state" and I've used both meanings in the
name for AggStatePerAggStateData.

Perhaps just renaming AggStatePerAggStateData to AggStateTransStateData
would be good enough?

I've attached a delta patch based on your patch, in this I've:

1. Renamed AggStatePerAggStateData to AggStateTransStateData and all
variables using that are renamed to suit better.
2. Removed surplus peraggstate->sortstates[currentsortno] = NULL; (not
related to this patch, but since we're moving that part of the code, we'd
better fix)
3. Put back the missing aggfnoid from the error message.
4. Changed initialize_aggregates() to not pass the states. They're already
in AggState and we're using aggstate->numstates to get the count of the
items in that array, so it seems wrong to allow a different array to ever
be passed in.
5. Changed wording of a few comments to try and reduce confusing of 'state'
and 'transition state'.
6. Renamed AggState.peraggstate to transstates. I pluralised this to try to
reduce confusion of the single state pointers named 'transstate' in the
functions in nodeAgg.c. I did think that peragg should also become peraggs
and pergroup should become pergroups, but didn't change those.

Anything else I changed is self explanatory.

What do you think?

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
sharing_aggstate-heikki-1_delta1.patch application/octet-stream 42.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas Seltenreich 2015-07-27 06:03:29 Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c
Previous Message Piotr Stefaniak 2015-07-27 05:33:58 Re: spgist recovery assertion failure