Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17

From: Scott Carey <scott(dot)carey(at)algonomy(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Significant performance issues with array_agg() + HashAggregate plans on Postgres 17
Date: 2026-04-02 07:23:43
Message-ID: CA+vubOGqg7f2E0_OW8EMruAObejnCS1gCgguT+cnwHG9bRxMDQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, Apr 1, 2026 at 6:44 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>
> That is a very strong clue. Check for property differences (e.g.
> with psql's "\l+" and "\drds") between the new test database and
> the database where you see the problem.
>
>
I have discovered the root cause.
This database is old. It pre-dates Postgres 8.4 which introduced
array_agg. Apparently, in some version prior to 8.4 array_agg was added
as a user function, defined as below for bigint:

create AGGREGATE array_agg(
BASETYPE = bigint,
SFUNC = array_append,
STYPE = bigint[],
INITCOND = '{}'
);

So if you create a test database and run the previous test, performance
will be fine and the query will be fast. Then run:
create AGGREGATE array_agg(BASETYPE = bigint, SFUNC = array_append,STYPE =
bigint[], INITCOND = '{}');

It will be slow and reproduce this behavior.

This leaves a few open questions:

Why would this run so much more slowly after updating from postgres 12 to
17? It is a user defined aggregate, although maybe not as optimized as
the intrinsic one it shouldn't behave this way.
If instead I create the same thing with a different name: "array_agg_alt" ,
the performance of that aggregate is awful too when combined with
HashAggregate query plans. So this is not due to name aliasing. How many
other user defined aggregates does this affect? Would it affect simple
ones like an alternate for "min" or only those that grow in size and
accumulate data?

It looks like I can fix this with a simple "drop aggregate
array_agg(bigint)", as the built-in function remains after removing this.
But I am left wondering how many user defined aggregates have a similar
problem.

A bit more history / info in case someone stumbles upon this:

The fact that the problem did not reproduce on a new / fresh database on
the same postgres instance that otherwise had the problem was a huge clue.
I just had to think more about all the ways the databases differ. Server
settings, hardware, background activity -- these could all be ruled out now
as they were the same for both. My first thoughts were related to the
age of the database and all of the upgrade cycles it had been through, and
the size of the database. There are tens of thousands of tables in the
production db (many partition tables). But one of the systems
reproducing the problem identically (my laptop) had the same schema, but
almost no partition tables and only small test data. So I assumed the pure
db size was not to blame.

The fresh database was an almost empty schema, the one with the problem was
large, old, and crufty. Schemas for this system are built from scratch
regularly for testing by running a sequence of schema update files --
almost 2000 of them, essentially a history of all schema updates since the
birth of the database. So I decided to do a binary search 'bisect' on
these 2000 update files, halving the number of candidate changes with each
iteration. It turned out one of the earliest schema changes was the one
to blame, adding a user defined array_agg. This was dated from 2008,
before Postgres had its own built-in array_agg function.

This did not cause any problems until the upgrade from Postgres 12 to 17
triggered this behavior. One of my test systems on Postgres 16 also
reproduces the problem, so I assume this was introduced between version 13
and 16 inclusive.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Guo 2026-04-02 07:34:33 Re: Potential partition pruning regression on PostgreSQL 18
Previous Message David Rowley 2026-04-01 23:00:48 Re: Potential partition pruning regression on PostgreSQL 18