Re: BUG #15869: Custom aggregation returns null when parallelized

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Kassym Dorsel <k(dot)dorsel(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15869: Custom aggregation returns null when parallelized
Date: 2019-06-24 22:59:03
Message-ID: CAKJS1f_Qi0iboCos3wu6QiAbdF-9FoK57wxzKbe2-WcesN4rFA@mail.gmail.com
Lists: pgsql-bugs

On Tue, 25 Jun 2019 at 04:07, Kassym Dorsel <k(dot)dorsel(at)gmail(dot)com> wrote:
> Right, adding the Gather node makes it use the combine func and this is where the problem is.

You're mixing up Gather and Parallel Aggregates. Setting
force_parallel_mode to on does not force the aggregate to be
parallelised. It just tries to inject a Gather node at the top of the
plan. I think it was really meant just to test the tuple queues for
parallel query back in 9.6. You're certainly not the only person to
have been confused by it.
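
To make the distinction concrete, here is a minimal sketch of how to
check which of the two is actually happening. The table name temp comes
from the quoted message; my_agg and the column x are placeholders for
the custom aggregate and its input, and the cost settings are only one
way to nudge the planner toward a parallel plan, not a recommendation.

```sql
-- force_parallel_mode only tries to put a Gather on top of the plan;
-- by itself it does not split the aggregate into partial/final steps.
SET force_parallel_mode = on;
EXPLAIN (COSTS OFF) SELECT my_agg(x) FROM temp;  -- my_agg, x: placeholders

-- Encourage a genuine parallel aggregate instead.
RESET force_parallel_mode;
SET max_parallel_workers_per_gather = 2;
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
SET min_parallel_table_scan_size = 0;
EXPLAIN (COSTS OFF) SELECT my_agg(x) FROM temp;
```

A parallel aggregate appears in the plan as a Partial Aggregate below
the Gather with a Finalize Aggregate above it, and that is where the
combine function comes into play.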

> You're right on handling of null values in my combine function. Since this was being run on a table with 150k rows, I had assumed that the contents of my aggregate types would never be null/empty.
>
> Thinking about it, it would make sense to receive an aggregate type with count = 0 or null iff there is 1 worker (1 result to combine the other being null/empty). When there are 2 or more workers I would assume that rows would be relatively evenly split and the return of my aggregate type would be filled given the 150k rows. I tried with 1,2,3,4 workers (ALTER TABLE temp SET (parallel_workers = 1,2,3,4);) and got the same null results before adding support for null values.
>
> Is this expected behavior when number of workers is >=2? An explicit paragraph in parallel aggregates documentation outlining null support in combine func might be helpful.

I don't think anyone would be opposed to improving the documentation, but
in this case, it's not the state that was NULL. You don't need to deal
with that since you made your combine function strict. It was your
array elements that were NULL and "<value> <op> NULL" yielding NULL is
fairly fundamental to SQL, not really specific to aggregation. Your
initcond made the q[] array an empty array, so trying to fetch an
element that does not exist will yield NULL. You wouldn't have had the
issue if you'd set all those array elements to 0 in the initcond, but
I've not taken the time to understand your transfn to know if that's
valid. If you've added NULL handling in the combinefn, then that's
likely fine.
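
The NULL propagation described above can be seen in isolation, without
any aggregate involved. A small sketch follows; the float8 element type
is an assumption, since the actual type of q[] is not shown in this
message.

```sql
-- Subscripting past the end of an array gives NULL, not an error.
SELECT ('{}'::float8[])[1] IS NULL;           -- true

-- Any arithmetic involving that NULL is NULL as well.
SELECT (('{}'::float8[])[1] + 1) IS NULL;     -- true

-- Treating a missing element as 0 keeps the arithmetic defined.
SELECT coalesce(('{}'::float8[])[1], 0) + 1;  -- 1
```

Wrapping element reads in coalesce() inside the combine function is one
way to get that effect; pre-filling the array in the initcond is the
other, provided the transition function is happy with zeros there.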

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
