Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17

From: Scott Carey <scott(dot)carey(at)algonomy(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17
Date: 2026-04-01 17:50:53
Message-ID: CA+vubOH8XfoD_Qke0OgrPwhse6rsiAjuen1WWj+AKmB8cpy6Yg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, Apr 1, 2026 at 6:44 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Scott Carey <scott(dot)carey(at)algonomy(dot)com> writes:
> > I did not mention a few values that differ between the servers that
> > reproduce this, like autovacuum tuning parameters and
> > maintenance_work_men. adding settings to the explain gives a couple
> more,
> > unlikely to be related to the problem:
>
> > Settings: temp_buffers = '512MB', work_mem = '1000MB',
> > effective_io_concurrency = '16', effective_cache_size = '150GB'
>
> Of course your test case is controlling for work_mem, but
> I wonder whether temp_buffers could affect this. I think
> that those are only used for user-defined temp tables, not
> the temp files a batched hashjoin creates, but maybe I'm
> misremembering.
>
>
modifying temp_buffers does not affect it. The 5 systems reproducing the
problem have various values for the settings above, some unset.

> > While writing this, I decided to test out a few more vector extension
> test
> > cases, and discovered something new and mind boggling: :
> > On systems that reproduces the problem, if I create a new test database,
> > then test the query in that database, the problem does not occur.
>
> That is a very strong clue. Check for property differences (e.g.
> with psql's "\l+" and "\drds") between the new test database and
> the database where you see the problem.
>
>
It is a strong clue that I don't know how to leverage. \l+ has no
differences other than an access privilege.
\drds is new to me, but it reports "did not find any settings" for both
the database with the problem and tne new test one without.

At this point, I wonder if there is some residual strangeness since the
reproducing examples are all 'old' databases that have gone through many
pg_upgrades over the years. They also 'leapt' from version 12 to 17. I
suppose I could try a brand new test case on a v12 system (assuming I can
find a yum repo that still has that or a system that still has the
packages), then upgrade to v17 and see if that reproduces the problem.

I will probably attempt adding debug symbols and linux 'perf' first.

regards, tom lane
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message David Rowley 2026-04-01 23:00:48 Re: Potential partition pruning regression on PostgreSQL 18
Previous Message Tom Lane 2026-04-01 13:44:41 Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17