Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Lukas Eder <lukas(dot)eder(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Subject: Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5
Date: 2017-10-11 19:51:32
Message-ID: 23270.1507751492@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Lukas Eder <lukas(dot)eder(at)gmail(dot)com> writes:
> [ this crashes: ]
> select
> cume_dist(1) within group (order by a desc),
> rank(1) within group (order by a desc),
> dense_rank(1) within group (order by a asc),
> percent_rank(1) within group (order by a asc)
> from (values(1)) t(a);
>
> The issue depends on a certain set of combinations of the above function
> calls. Each function can be called individually without problems. Some
> functions can be combined without problems as well.

So the problem arises when nodeAgg.c decides it can combine the transition
calculations for two different ordered-set aggregates, leading to the
final functions for those OSAs being invoked successively on the same
transition state. The finalfns are not expecting that and the second
one crashes. (Concretely, this happens because osastate->sortstate
has already been reset to null, after closing down the contained
tuplesort object.)

It seems like this is probably fixable by having the finalfns not do
tuplesort_end immediately, but rather track whether anyone's yet
done the sort, and then do something like

if (already_sorted)
tuplesort_rescan(osastate->sortstate);
else
tuplesort_performsort(osastate->sortstate);

However, in order to make use of tuplesort_rescan, we'd have had
to pass randomAccess = true to tuplesort_begin_xxx, which would
be rather an annoying overhead for the majority case where there
isn't a potential for reuse.

What I think we should do about this is introduce another aggregate
API function, a bit like AggGetAggref or AggCheckCallContext,
that an aggregate function could call to find out whether there is
any possibility of multiple invocation of finalfns on the same
transition state. For the moment I'd just be worried about making
it work for ordered-set aggs, which are the only case where we don't
(er, didn't) require that to work anyway. But potentially we could
extend it to work for all agg cases and then finalfns could be
optimized in cases where it's safe for them to be destructive
of the transition state value.

Speaking of AggGetAggref, there's another thing that I think 804163bc2
did wrong for ordered-set aggregates: it can return the wrong Aggref
when two aggregates' intermediate states have been merged. I do not
think it's appropriate to say "well, you shouldn't care which of the
Aggrefs you get". It looks like this accidentally fails to fail
for the current OSAs, because indeed they do only look at the input-
related fields of the Aggref, but surely that's not something to
rely on. It's most certainly not acceptable that the function's
documentation doesn't mention that its result may be a lie.

This might be a bigger change than we want to push into the back
branches. In that case probably a back-patchable fix is to hack
nodeAgg.c so it will never combine input states for OSAs.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2017-10-11 21:11:20 Re: [HACKERS] Re: BUG #14821: idle_in_transaction_session_timeout sometimes gets ignored when statement timeout is pending
Previous Message Tom Lane 2017-10-11 17:00:18 Re: BUG #14830: Missed NOTIFications, PostgreSQL 9.1.24