Re: [PATCH] Speed up of vac_update_datfrozenxid.

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Rustam Khamidullin <rstm(dot)khamidullin(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] Speed up of vac_update_datfrozenxid.
Date: 2025-10-31 17:17:35
Message-ID: 8ea10508-f353-43e2-883e-f41d961d8a44@iki.fi
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 15/07/2025 14:02, Rustam Khamidullin wrote:
> Hi hackers,
>
> While testing behavior of 'vacuum' functionality under high load
> we've noticed that the exclusive lock with 'LOCKTAG_DATABASE_FROZEN_ID'
> tag could be held for a prolonged time, thus causing contention with
> other sessions.
> This lock is obtained by the 'vac_update_datfrozenxid' function at the
> end of vacuum
> process to update value of 'datfrozenxid' in the 'pg_database' relation.
> This function currently performs two operations under the
> mentioned lock: first scanning the 'pg_class' to find minimal
> 'relfrozenxid' and 'relminmxid' and then update corresponding values
> in the 'pg_database' relation. Update of the 'pg_database' table
> includes locking the required tuple and then deciding whether row
> need to be updated or not. While looking at the
> 'vac_update_datfrozenxid' content, we've noticed following comment:
>
> /*
>  * Fetch a copy of the tuple to scribble on. We could check the syscache
>  * tuple first. If that concluded !dirty, we'd avoid waiting on
>  * concurrent heap_update() and would avoid exclusive-locking the buffer.
>  * For now, don't optimize that.
>  */
>
> The optimization described in this comment (i.e. scanning and
> locking tuple only if we really intend to update it) seems to be a
> valid one, so I've created a patch to test it (attached to this
> message). It should be applicable on top of current 'master' branch.
> The code now tries to obtain data from the SysCache first and checks
> whether it's necessary to call 'systable_inplace_update_begin()' to
> actually update xid values.
>
> To test the effect of the patch, a synthetic test was written using
> bpftrace.
> Bpftrace script measures the execution time of 'vac_update_datfrozenxid'.
> The script is attached to the email.
>
> The test scenario:
>
> 1. Create an empty database cluster
> 2. Set autovacuum_naptime = '1s' (so that 'vac_update_datfrozenxid' is
> called more often)
> 3. Run the bpftrace script (it outputs the function execution time in
> ns to stdout)
>
> It is not necessary to apply the load. The autovacuum worker will call
> 'vac_update_datfrozenxid' regardless.
> As a result, execution time of 'vac_update_datfrozenxid()' with the
> patch is 30% faster
>                   No patch         Patch          Speed up
> Mean:         73058 ns        51444 ns      30%
> Median:      71846  ns       48650 ns      33%
>
> It is worth noting that due to the fact thatbefore updating the
> 'datfrozenxid',a full 'pg_class' scan is performed, the performance
> improvementbecomes less significant as the number of
> relationsincreases(more time will be spent on sequentialscan of
> 'pg_class').The previous measurement was performedon an empty
> clusterwith nouser relations.To test the patch in a more realistic case,
> I generated 1000 tables in the database and repeated the measurement. No
> patch Patch Speed up
> Mean:97113 ns 73839 ns 24% Median:98328 ns 63541 ns 29% As we can see,
> the speed gain has decreased, but it is still significant.

How significant is that speedup in the grand scheme of things? What
fraction is vac_update_datfrozenxid of the overall work that autovacuum
does? Did this patch help with the lock contention that you mentioned at
the top?

- Heikki

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2025-10-31 17:33:54 Re: AIX support
Previous Message Bryan Green 2025-10-31 17:16:04 [PATCH] O_CLOEXEC not honored on Windows - handle inheritance chain