| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Dmytro Astapov <dastapov(at)gmail(dot)com> |
| Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
| Subject: | Re: array_agg(anyarray) silently produces corrupt results with parallel workers when inputs mix NULL and non-NULL array elements |
| Date: | 2026-04-04 16:41:41 |
| Message-ID: | 4102596.1775320901@sss.pgh.pa.us |
| Lists: | pgsql-bugs |
Dmytro Astapov <dastapov(at)gmail(dot)com> writes:
> array_agg(ARRAY[...]) produces silently corrupted 2-D arrays when the query
> uses parallel partial aggregation and the input arrays contain a mix of
> NULL and non-NULL elements. NULL values appear at the wrong positions in
> the output, and non-NULL values disappear or shift.
Right you are.
> I think that the following two changes to array_agg_array_combine() in
> array_userfuncs.c fix the issue:
> 1. Change the condition guarding the null bitmap block from
> "if (state2->nullbitmap)" to
> "if (state1->nullbitmap || state2->nullbitmap)".
> 2. Change the bitmap reallocation size from
> "state1->aitems + state2->aitems" to
> "Max(state1->aitems + state2->aitems, newnitems)"
> to ensure the bitmap is always large enough.
This seems basically right, but I think we could simplify the code
some more: AFAICS the required bitmap size is newnitems, full stop.
The initial-setup path is confused about that too, allocating
newnitems+1 which is pointless.
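To illustrate the point under discussion, here is a minimal, self-contained sketch of merging two partial states when either side may carry a null bitmap. It is not the attached patch and not the real array_agg_array_combine() code: ToyState, toy_bitmap_copy(), toy_combine() and toy_is_null() are hypothetical names, and plain calloc/free stand in for PostgreSQL's memory management. The sketch assumes the usual convention that a set bit means "non-NULL" and that a missing bitmap means "no NULLs at all".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the aggregate state. */
typedef struct ToyState
{
    int      nitems;      /* elements accumulated so far */
    uint8_t *nullbitmap;  /* bit i set => element i is non-NULL;
                           * NULL pointer => no NULL elements at all */
} ToyState;

/* Copy nitems bits from src (starting at bit 0) into dest at bit destoff.
 * A NULL src is treated as "all elements non-NULL". */
static void
toy_bitmap_copy(uint8_t *dest, int destoff, const uint8_t *src, int nitems)
{
    for (int i = 0; i < nitems; i++)
    {
        int bit = src ? (src[i / 8] >> (i % 8)) & 1 : 1;
        int d = destoff + i;

        if (bit)
            dest[d / 8] |= (uint8_t) (1 << (d % 8));
        else
            dest[d / 8] &= (uint8_t) ~(1 << (d % 8));
    }
}

/* Merge s2 into s1.  The bitmap must be handled when EITHER side has one,
 * and the merged bitmap only needs room for newnitems bits. */
static void
toy_combine(ToyState *s1, const ToyState *s2)
{
    int newnitems = s1->nitems + s2->nitems;  /* overflow check omitted here */

    if (s1->nullbitmap || s2->nullbitmap)
    {
        size_t   nbytes = ((size_t) newnitems + 7) / 8;
        uint8_t *merged = calloc(nbytes ? nbytes : 1, 1);

        assert(merged != NULL);
        toy_bitmap_copy(merged, 0, s1->nullbitmap, s1->nitems);
        toy_bitmap_copy(merged, s1->nitems, s2->nullbitmap, s2->nitems);
        free(s1->nullbitmap);
        s1->nullbitmap = merged;
    }
    s1->nitems = newnitems;
}

/* Returns 1 if element i is NULL according to the state's bitmap. */
static int
toy_is_null(const ToyState *s, int i)
{
    if (s->nullbitmap == NULL)
        return 0;
    return !((s->nullbitmap[i / 8] >> (i % 8)) & 1);
}
```

With a guard of only "if (s2->nullbitmap)", the case where s1 has a bitmap and s2 does not would leave the bits for s2's elements unwritten, which is the kind of misplaced-NULL symptom described in the report.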
It also troubled me that there are no checks for integer overflow
when calculating the new sizes. I believe that the pg_nextpower2_32
bits are okay even with large inputs (we'll end up with palloc rejecting
the request size if there's an overflow), but if reqsize or newnitems
overflows it could be bad.
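As a rough sketch of the kind of guard meant here, the following shows an overflow-checked addition of two non-negative counts. The helper name toy_add_nonneg_overflows() is hypothetical, and this is not taken from the attached patch. PostgreSQL's own tree provides helpers such as pg_add_s32_overflow() in common/int.h; whether the patch uses them is not shown in this message.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: add two non-negative ints with an overflow check.
 * Returns true (leaving *result untouched) if a + b would exceed INT_MAX,
 * otherwise stores the sum in *result and returns false. */
static bool
toy_add_nonneg_overflows(int a, int b, int *result)
{
    assert(a >= 0 && b >= 0);   /* item counts are never negative */

    if (a > INT_MAX - b)
        return true;            /* the sum would not fit in an int */
    *result = a + b;
    return false;
}
```

A caller computing something like newnitems would test the return value first and raise an error instead of proceeding with a wrapped-around size.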
So I ended up with the attached revised patch, where I also made one
or two cosmetic adjustments like putting the type-comparison checks
next to the dimension comparisons. Look good to you?
regards, tom lane
| Attachment | Content-Type | Size |
|---|---|---|
| v2-fix_array_agg_parallel_nullbitmap.patch | text/x-diff | 3.3 KB |