Re: Statistical aggregate functions are not working with PARTIAL aggregation

From: Andres Freund <andres(at)anarazel(dot)de>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Statistical aggregate functions are not working with PARTIAL aggregation
Date: 2019-05-18 19:37:38
Message-ID: 20190518193738.kka4md4pbzylqeas@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

David, anyone, any comments?

On 2019-05-16 20:04:04 -0700, Andres Freund wrote:
> On 2019-05-08 13:06:36 +0900, Kyotaro HORIGUCHI wrote:
> > This behavior is introduced by 69c3936a14 (in v11). At that time
> > FunctionCallInfoData is pallioc0'ed and has fixed length members
> > arg[6] and argnull[7]. So nulls[1] is always false even if nargs
> > = 1 so the issue had not been revealed.
>
> > After introducing a9c35cf85c (in v12) the same check is done on
> > FunctionCallInfoData that has NullableDatum args[] of required
> > number of elements. In that case args[1] is out of palloc'ed
> > memory so this issue has been revealed.
>
> > In a second look, I seems to me that the right thing to do here
> > is setting numInputs instaed of numArguments to numTransInputs in
> > combining step.
>
> Yea, to me this just seems a consequence of the wrong
> numTransInputs. Arguably this is a bug going back to 9.6, where
> combining aggregates where introduced. It's just that numTransInputs
> isn't used anywhere for combining aggregates, before 11.
>
> It's documentation says:
>
> /*
> * Number of aggregated input columns to pass to the transfn. This
> * includes the ORDER BY columns for ordered-set aggs, but not for plain
> * aggs. (This doesn't count the transition state value!)
> */
> int numTransInputs;
>
> which IMO is violated by having it set to the plain aggregate's value,
> rather than the combine func.
>
> While I agree that fixing numTransInputs is the right way, I'm not
> convinced the way you did it is the right approach. I'm somewhat
> inclined to think that it's wrong that ExecInitAgg() calls
> build_pertrans_for_aggref() with a numArguments that's not actually
> right? Alternatively I think we should just move the numTransInputs
> computation into the existing branch around DO_AGGSPLIT_COMBINE.
>
> It seems pretty clear that this needs to be fixed for v11, it seems too
> fragile to rely on trans_fcinfo->argnull[2] being zero initialized.
>
> I'm less sure about fixing it for 9.6/10. There's no use of
> numTransInputs for combining back then.
>
> David, I assume you didn't adjust numTransInput plainly because it
> wasn't needed / you didn't notice? Do you have a preference for a fix?

Unless somebody comments I'm later today going to move the numTransInput
computation into the DO_AGGSPLIT_COMBINE branch in
build_pertrans_for_aggref(), add a small test (using
enable_partitionwise_aggregate), and backpatch to 11.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-05-18 19:45:11 Re: Multivariate MCV stats can leak data to unprivileged users
Previous Message Jeff Janes 2019-05-18 19:28:59 improve transparency of bitmap-only heap scans