Re: Statistical aggregate functions are not working with PARTIAL aggregation

From: Andres Freund <andres(at)anarazel(dot)de>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Statistical aggregate functions are not working with PARTIAL aggregation
Date: 2019-05-17 03:04:04
Message-ID: 20190517030404.73izvqxgs3amfwd3@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-05-08 13:06:36 +0900, Kyotaro HORIGUCHI wrote:
> This behavior is introduced by 69c3936a14 (in v11). At that time
> FunctionCallInfoData is pallioc0'ed and has fixed length members
> arg[6] and argnull[7]. So nulls[1] is always false even if nargs
> = 1 so the issue had not been revealed.

> After introducing a9c35cf85c (in v12) the same check is done on
> FunctionCallInfoData that has NullableDatum args[] of required
> number of elements. In that case args[1] is out of palloc'ed
> memory so this issue has been revealed.

> In a second look, I seems to me that the right thing to do here
> is setting numInputs instaed of numArguments to numTransInputs in
> combining step.

Yea, to me this just seems a consequence of the wrong
numTransInputs. Arguably this is a bug going back to 9.6, where
combining aggregates where introduced. It's just that numTransInputs
isn't used anywhere for combining aggregates, before 11.

It's documentation says:

/*
* Number of aggregated input columns to pass to the transfn. This
* includes the ORDER BY columns for ordered-set aggs, but not for plain
* aggs. (This doesn't count the transition state value!)
*/
int numTransInputs;

which IMO is violated by having it set to the plain aggregate's value,
rather than the combine func.

While I agree that fixing numTransInputs is the right way, I'm not
convinced the way you did it is the right approach. I'm somewhat
inclined to think that it's wrong that ExecInitAgg() calls
build_pertrans_for_aggref() with a numArguments that's not actually
right? Alternatively I think we should just move the numTransInputs
computation into the existing branch around DO_AGGSPLIT_COMBINE.

It seems pretty clear that this needs to be fixed for v11, it seems too
fragile to rely on trans_fcinfo->argnull[2] being zero initialized.

I'm less sure about fixing it for 9.6/10. There's no use of
numTransInputs for combining back then.

David, I assume you didn't adjust numTransInput plainly because it
wasn't needed / you didn't notice? Do you have a preference for a fix?

Independent of these changes, some of the code around partial, ordered
set and polymorphic aggregates really make it hard to understand things:

/* Detect how many arguments to pass to the finalfn */
if (aggform->aggfinalextra)
peragg->numFinalArgs = numArguments + 1;
else
peragg->numFinalArgs = numDirectArgs + 1;

What on earth is that supposed to mean? Sure, the +1 is obvious, but why
the different sources for arguments are needed isn't - especially
because numArguments was just calculated with the actual aggregate
inputs. Nor is aggfinalextra's documentation particularly elucidating:
/* true to pass extra dummy arguments to aggfinalfn */
bool aggfinalextra BKI_DEFAULT(f);

especially not why aggfinalextra means we have to ignore direct
args. Presumably because aggfinalextra just emulates what direct args
does for ordered set args, but we allow both to be set.

Similarly

/* Detect how many arguments to pass to the transfn */
if (AGGKIND_IS_ORDERED_SET(aggref->aggkind))
pertrans->numTransInputs = numInputs;
else
pertrans->numTransInputs = numArguments;

is hard to understand, without additional comments. One can, looking
around, infer that it's because ordered set aggs need sort columns
included. But that should just have been mentioned.

And to make sense of build_aggregate_transfn_expr()'s treatment of
direct args, one has to know that direct args are only possible for
ordered set aggregates. Which IMO is not obvious in nodeAgg.c.

...

I feel this code has become quite creaky in the last few years.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rushabh Lathia 2019-05-17 03:40:10 behaviour change - default_tablesapce + partition table
Previous Message Alvaro Herrera 2019-05-17 02:34:13 Re: PostgreSQL 12: Feature Highlights