Add per-backend lock statistics

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Add per-backend lock statistics
Date: 2026-06-03 13:58:41
Message-ID: aiAzEY+cMQb/W8yu@bdtpg
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,

Now that we have global lock statistics since 4019f725f5d, it could be useful
to have the same kind of information on a per-backend basis.

Indeed, pg_stat_lock gives us cluster-wide aggregates: total waits, total wait
time, total fast-path exceeded across all backends since last reset.

When we see high numbers, we can't answer:

- Which backend is affected the most?
- Is it one backend affected or many?
- Is a specific application or connection pool suffering?
- After a specific workload/application is improved, did its lock behavior
improve?

With per-backend lock stats, we could:

1/ Isolate problematic sessions. We can correlate locks behavior with specific
PIDs visible in pg_stat_activity: identify the exact application_name or user
experiencing lock waits.

2/ Debug live contention. During an incident, we could pinpoint which backends
are experiencing fast-path exhaustion or lock waits without having to reset
global stats and lose history.

3/ Define workload characterization. Different backend types may have very
different lock profiles. Per-backend stats would let us see this directly.

4/ Compare before/after per session. We could measure a single backend's lock
behavior across a specific operation, which is impossible with global counters
that include metrics from all other backends.

IO and WAL stats already have per-backend counterparts (pg_stat_get_backend_io(),
pg_stat_get_backend_wal()). Lock stats are the same class of operational data:
having them only at the global level is an inconsistency that limits observability.

As far the technical implementation:

This data can be retrieved with a new system function called
pg_stat_get_backend_lock(), that returns one tuple per lock type based on the PID
provided in input.

pgstat_flush_backend() gains a new flag value, able to control the flush of the
lock stats.

This patch relies mostly on the infrastructure provided by 9aea73fc61d4, that
has introduced backend statistics.

The overhead (2 functions calls and counters increments) on the hot path (normal
lock acquisition) is zero: counters are only incremented on paths that are already
"slow" (post deadlock timeout waits, fast-path slot exhaustion) and does not add
that much memory per-backend: PgStat_PendingLock is 288 bytes.

The patch is made of 2 sub-patches:

0001: Refactor pg_stat_get_lock() to use a helper function

Extract the tuple-building logic from pg_stat_get_lock() into a new
static helper pg_stat_lock_build_tuples(). This is in preparation for
pg_stat_get_backend_lock() which will reuse the same helper, following
the pattern established by pg_stat_io_build_tuples() for IO stats and
pg_stat_wal_build_tuple() for WAL stats.

0002: Add per-backend lock statistics

As discussed above.

Looking forward to your feedback,

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v1-0001-Refactor-pg_stat_get_lock-to-use-a-helper-functio.patch text/x-diff 2.9 KB
v1-0002-Add-per-backend-lock-statistics.patch text/x-diff 13.5 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Zsolt Parragi 2026-06-03 13:58:51 Re: Proposal: Conflict log history table for Logical Replication
Previous Message Matthias van de Meent 2026-06-03 13:35:00 Re: Init connection time grows quadratically