Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Lukas Eder <lukas(dot)eder(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5
Date: 2017-10-11 21:17:43
Message-ID: CAKJS1f-EHRN5v2K58D_unR1J0jP6s6TXOSEJPGevUv1JBRtszQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 12 October 2017 at 08:51, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> So the problem arises when nodeAgg.c decides it can combine the transition
> calculations for two different ordered-set aggregates, leading to the
> final functions for those OSAs being invoked successively on the same
> transition state. The finalfns are not expecting that and the second
> one crashes. (Concretely, this happens because osastate->sortstate
> has already been reset to null, after closing down the contained
> tuplesort object.)

Hmm, yeah. It was all coded with the assumption that final functions
never modify the transaction state.

> It seems like this is probably fixable by having the finalfns not do
> tuplesort_end immediately, but rather track whether anyone's yet
> done the sort, and then do something like
>
> if (already_sorted)
> tuplesort_rescan(osastate->sortstate);
> else
> tuplesort_performsort(osastate->sortstate);
>
> However, in order to make use of tuplesort_rescan, we'd have had
> to pass randomAccess = true to tuplesort_begin_xxx, which would
> be rather an annoying overhead for the majority case where there
> isn't a potential for reuse.
>
> What I think we should do about this is introduce another aggregate
> API function, a bit like AggGetAggref or AggCheckCallContext,
> that an aggregate function could call to find out whether there is
> any possibility of multiple invocation of finalfns on the same
> transition state. For the moment I'd just be worried about making
> it work for ordered-set aggs, which are the only case where we don't
> (er, didn't) require that to work anyway. But potentially we could
> extend it to work for all agg cases and then finalfns could be
> optimized in cases where it's safe for them to be destructive
> of the transition state value.

Yeah maybe if core tracked the total number of references in
AggStatePerTrans, and finalize_aggregate() incremented another counter
to track how many times the final function had been called on this
state, then if there was some way to expose that information to the
final function, it would know if it was the first or the last final
function to use the state. I'm just not sure how much we'd want to
allow the final function to see. Would we expose the whole
AggStatePerTrans? or just invent API functions to allow them to know
if they're the first or last?

> Speaking of AggGetAggref, there's another thing that I think 804163bc2
> did wrong for ordered-set aggregates: it can return the wrong Aggref
> when two aggregates' intermediate states have been merged. I do not
> think it's appropriate to say "well, you shouldn't care which of the
> Aggrefs you get". It looks like this accidentally fails to fail
> for the current OSAs, because indeed they do only look at the input-
> related fields of the Aggref, but surely that's not something to
> rely on. It's most certainly not acceptable that the function's
> documentation doesn't mention that its result may be a lie.
>
> This might be a bigger change than we want to push into the back
> branches. In that case probably a back-patchable fix is to hack
> nodeAgg.c so it will never combine input states for OSAs.

I've attached a patch which does this.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
shared_aggregate_state_fix.patch application/octet-stream 832 bytes

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Lukas Fittl 2017-10-11 22:23:07 Re: [HACKERS] Re: BUG #14821: idle_in_transaction_session_timeout sometimes gets ignored when statement timeout is pending
Previous Message Marko Tiikkaja 2017-10-11 21:16:11 Re: BUG #14830: Missed NOTIFications, PostgreSQL 9.1.24