Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17

From: Scott Carey <scott(dot)carey(at)algonomy(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17
Date: 2026-03-31 17:55:42
Message-ID: CA+vubOE6feEWh7XxoWqdPh634SUVjHKW7s9BgMD755wJTqQFPw@mail.gmail.com
Lists: pgsql-performance

On Tue, Mar 31, 2026 at 5:03 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

>
> I tried and failed to recreate this locally on 17.9. For me the
> json_agg query is slower than array_agg(). I tried making the table
> 10x bigger and still don't see the same issue. The one with more
> work_mem and fewer batches is always faster for me.
>
> Is the machine under a lot of memory pressure and swapping pages to
> disk? Maybe you need to consider running a lower work_mem setting. How
> much RAM is installed in this machine?
>

The machine has 768GB of RAM and was at about 35% CPU usage at the time of
these tests. It runs AlmaLinux 9 with 50GB of shared_buffers; OS available
memory is > 600GB (mostly filled with page cache).
The system is under heavy load: at any given time there are many large
sequential scans from OLAP queries over 1GB to 80GB tables, plus large
batch updates. System RAM buffers disk access relatively well; there is a
constant stream of 100MB/sec to 200MB/sec from disk (NVMe RAID), with
occasional bursts to 2000MB/sec, but iowait is generally low (0.2%).
The problem reproduces on my Ubuntu 25.10 laptop at idle with a near-empty
database and 32MB shared_buffers.
It also reproduces on the read-only streaming standby server, which is
extremely idle and swimming in just as much RAM.
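For anyone trying to reproduce this: the table definition is not shown in the thread, so the setup below is a guess that matches the quoted plans (200,000 rows, roughly 50,000 distinct product_ids, two bigint columns giving width=16). The actual test table may differ.

```sql
-- Hypothetical schema inferred from the quoted EXPLAIN output.
CREATE TABLE array_agg_test (product_id bigint, region_id bigint);

-- ~4 rows per product_id: 200,000 rows over 50,000 groups.
INSERT INTO array_agg_test
SELECT (g % 50000) + 1, g
FROM generate_series(1, 200000) AS g;

ANALYZE array_agg_test;

SET hash_mem_multiplier = 2;
SET work_mem = '100MB';

EXPLAIN (ANALYZE, BUFFERS)
SELECT product_id, array_agg(region_id)
FROM array_agg_test
GROUP BY product_id;
```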

>
> > set hash_mem_multiplier = 2;
> > set work_mem = '100MB';
> >
> > explain (analyze, buffers) select product_id, array_agg(region_id) from
> array_agg_test group by product_id;
> > QUERY PLAN
> >
> -----------------------------------------------------------------------------------------------------------------------------
> > HashAggregate (cost=4274.00..4771.49 rows=49749 width=40) (actual
> time=4628.278..4643.765 rows=50000 loops=1)
> > Group Key: product_id
> > Batches: 1 Memory Usage: 55649kB
> > Buffers: shared hit=1274
> > -> Seq Scan on array_agg_test (cost=0.00..3274.00 rows=200000
> width=16) (actual time=0.030..16.694 rows=200000 loops=1)
> > Buffers: shared hit=1274
> > Planning Time: 0.067 ms
> > Execution Time: 4648.698 ms
>
> > Below, note json_agg does not have this problem:
> >
> > set hash_mem_multiplier = 2;
> > set work_mem = '500MB';
> >
> > explain (analyze, buffers) select product_id, json_agg(region_id) from
> array_agg_test group by product_id;
> > QUERY PLAN
> >
> -----------------------------------------------------------------------------------------------------------------------------
> > HashAggregate (cost=4274.00..4895.86 rows=49749 width=40) (actual
> time=110.975..122.781 rows=50000 loops=1)
> > Group Key: product_id
> > Batches: 1 Memory Usage: 67089kB
> > Buffers: shared read=1274
> > -> Seq Scan on array_agg_test (cost=0.00..3274.00 rows=200000
> width=16) (actual time=0.034..17.659 rows=200000 loops=1)
> > Buffers: shared read=1274
> > Planning:
> > Buffers: shared hit=10
> > Planning Time: 0.054 ms
> > Execution Time: 124.929 ms
>
> What changed here apart from the aggregate function? Why are the
> buffers being read on this run and not the previous? Same machine? Was
> there a restart?
>

I waited a few minutes; it is a busy server. I can run the example back to
back with no significant change other than the buffers shifting from reads
to hits. The pages are in the OS page cache if I wait a bit, and even when
they are on disk it's an NVMe SSD RAID 10 and the table is 'tiny' for this
server.
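To rule out caching effects, something like the pg_buffercache extension can show how much of the table is resident in shared_buffers (it cannot see the OS page cache). This is just a generic check, not output from my tests:

```sql
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Count shared buffers currently holding pages of the test table.
SELECT count(*) AS buffers_cached
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('array_agg_test');
```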

>
> json_agg allocates slightly more memory per agg state than array_agg.
> You can see that in the reported Hash Aggregate memory usage and I
> expect the actual transition function call between array_agg() and
> json_agg() not to differ very much in cost, so it very much feels like
> something else is going on here.
>

There are some other differences from a default config.
The database was created with
`initdb -E UTF-8`

Other non-default values in postgresql.conf that might be related:
max_files_per_process = 4000 (we have some partitioned tables with a lot
of partitions)
effective_io_concurrency = 16
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'

\l+ shows encoding UTF8, Locale Provider libc, Collate en_US.UTF-8 and
Ctype en_US.UTF-8

I don't know what other differences there could be, other than the OS. This
reproduces for me with the above settings on a RHEL 9 clone (PG 17) and on
Ubuntu 25.10 (PG 16), so I suspect it is not too picky about the distro used.
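If a complete picture helps, the full set of settings changed from the defaults can be pulled from pg_settings rather than eyeballing postgresql.conf, e.g.:

```sql
-- Every setting whose value did not come from the built-in defaults.
SELECT name, setting, unit, source
FROM pg_settings
WHERE source NOT IN ('default', 'override')
ORDER BY name;
```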

-Scott

>
> David
>
