| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Scott Carey <scott(dot)carey(at)algonomy(dot)com> |
| Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17 |
| Date: | 2026-03-31 19:26:09 |
| Message-ID: | 2764544.1774985169@sss.pgh.pa.us |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-performance |
Scott Carey <scott(dot)carey(at)algonomy(dot)com> writes:
>> On Tue, Mar 31, 2026 at 5:03 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>>> I tried and failed to recreate this locally on 17.9. For me the
>>> json_agg query is slower than array_agg(). I tried making the table
>>> 10x bigger and still don't see the same issue. The one with more
>>> work_mem and fewer batches is always faster for me.
>> I don't know what other differences there could be, other than OS. This
>> reproduces for me on Linux with the above on a RHEL 9 clone (pg 17) or with
>> Ubuntu 25.10 (pg 16) so I suspect it is not too picky about the distro used.
Like David, I can't reproduce the described behavior. I tried on
RHEL8/x86_64 and on macOS/M4, and got runtimes that barely vary
across different work_mem settings, all sub-100ms. It should be
noted that I tested v17 branch tip not precisely 17.9 --- but there's
nothing in the commit log to suggest that we changed v17's behavior
since February.
One thing I find interesting is that your results show significantly
more memory consumption as well as runtime. I had to add a run with
work_mem = "200MB" to get the no-batching behavior you show at
work_mem = "100MB", and then my results look like
$ egrep 'Exec|Batches' v17.out
Batches: 1 Memory Usage: 17937kB
Execution Time: 62.494 ms
Planned Partitions: 4 Batches: 5 Memory Usage: 9009kB Disk Usage: 3744kB
Execution Time: 80.044 ms
Planned Partitions: 16 Batches: 17 Memory Usage: 2385kB Disk Usage: 7112kB
Execution Time: 93.572 ms
Planned Partitions: 32 Batches: 33 Memory Usage: 1393kB Disk Usage: 14088kB
Execution Time: 97.021 ms
Planned Partitions: 64 Batches: 65 Memory Usage: 1089kB Disk Usage: 12200kB
Execution Time: 98.887 ms
Execution Time: 120.179 ms
Planned Partitions: 32 Batches: 33 Memory Usage: 1073kB Disk Usage: 14088kB
Execution Time: 98.609 ms
Batches: 1 Memory Usage: 67089kB
Execution Time: 110.035 ms
Execution Time: 82.040 ms
Your memory-usage numbers are integer multiples of mine.
That makes little sense either.
It seems like the planner is choosing the same plans for me as for
you, other than having a higher cutoff for when not to select
batching. So this is an executor issue not a planner issue.
Some thoughts:
* Does it repro without the "vector" extension? Seems unlikely that
that is related, but we're at the grasping-at-straws stage.
* More grasping at straws: is this stock community Postgres, or
some vendor's modification (eg RDS or Aurora)?
* It would be worth doing the EXPLAINs with the SETTINGS option,
just to make sure that there's not some non-default setting you
forgot to mention.
regards, tom lane
| From | Date | Subject | |
|---|---|---|---|
| Next Message | David Rowley | 2026-03-31 22:48:55 | Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17 |
| Previous Message | Scott Carey | 2026-03-31 18:06:36 | Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17 |