| From: | Nikolay Samokhvalov <nik(at)postgres(dot)ai> |
|---|---|
| To: | Rafael Thofehrn Castro <rafaelthca(at)gmail(dot)com> |
| Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Inconsistent increment of pg_stat_database.xact_rollback with logical replication |
| Date: | 2026-04-17 03:58:58 |
| Message-ID: | CAM527d_QWCBhBFYq1ZqvbDnXsHURKhQqBPxxQ=1_BxdfpZcHjg@mail.gmail.com |
| Lists: | pgsql-bugs pgsql-hackers |
On Thu, Apr 16, 2026 at 7:19 PM Rafael Thofehrn Castro
<rafaelthca(at)gmail(dot)com> wrote:
> Column xact_rollback from pg_stat_database gets inconsistently incremented when logical replication is being used (on publisher side).
...
> This is causing inconsistency in monitoring TPS metric of a database where we eventually see sudden spikes of TPS in the order of millions.
This still reproduces on master.
I agree on the root cause: ReorderBufferProcessTXN() ends each decoded
transaction with AbortCurrentTransaction() for catalog cleanup; in the
walsender that is a top-level abort, so
AtEOXact_PgStat_Database(isCommit=false) increments the backend-local
pgStatXactRollback.
The counts are flushed to shared stats on walsender exit, producing
an acute spike. Result: for production systems with tight alerting on
xact_rollback, this turns routine logical-replication operations
(disabling a subscription, dropping a slot, walsender restart) into
false-positive pages. Also experienced at GitLab [1][2][3].
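To make the spike concrete, here is a rough sketch of the arithmetic a rate-based monitor performs, assuming it derives a per-second rate from the counter delta between two scrapes. The function names and the numbers in the note below are hypothetical and are not taken from any real monitoring tool or from the reported incidents.

```c
#include <assert.h>

/*
 * Rate a monitoring system would derive from two samples of a
 * monotonically increasing counter (hypothetical helper).
 */
double
observed_rate(long prev_counter, long curr_counter, double interval_seconds)
{
    assert(interval_seconds > 0.0);
    return (double) (curr_counter - prev_counter) / interval_seconds;
}

/*
 * Shared counter value after a backend flushes its pending,
 * backend-local rollback count in one go (hypothetical helper).
 */
long
counter_after_flush(long shared_counter, long pending_local)
{
    return shared_counter + pending_local;
}
```

With illustrative numbers: if a walsender accumulated 60,000,000 pending rollbacks while decoding and flushes them all at exit, a monitor scraping every 60 seconds would compute `observed_rate(0, counter_after_flush(0, 60000000), 60.0)`, i.e. 1,000,000 rollbacks per second for that one interval, even though the decoding was spread over a much longer period.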
Attaching a simple patch that adds a backend-local flag pgStatXactSkipCounters
in pgstat_database.c that AtEOXact_PgStat_Database() honors to skip
the counter bump.
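For readers without the attachment open, the idea can be modeled roughly as below. This is a toy sketch, not the patch and not PostgreSQL source: the names `skip_counters`, `at_eoxact_stats`, `decode_one_txn` and `rollbacks_after_decoding` are stand-ins, and where exactly the real flag is set and cleared is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the backend-local counters; not PostgreSQL source. */
static long xact_commit_count = 0;
static long xact_rollback_count = 0;

/* Models the backend-local flag: when set, end-of-xact accounting is skipped. */
static bool skip_counters = false;

/* Toy model of the end-of-transaction stats hook. */
static void
at_eoxact_stats(bool is_commit)
{
    if (skip_counters)
        return;                 /* honor the flag: no counter bump */
    if (is_commit)
        xact_commit_count++;
    else
        xact_rollback_count++;
}

/*
 * Toy model of decoding one transaction: the cleanup step ends in an
 * abort. Setting the flag only around that abort is an assumption.
 */
static void
decode_one_txn(bool patched)
{
    bool saved = skip_counters;

    if (patched)
        skip_counters = true;
    at_eoxact_stats(false);     /* cleanup abort after decoding */
    skip_counters = saved;      /* restore so real aborts still count */
}

/* Returns the rollback count seen after decoding n_txns transactions. */
long
rollbacks_after_decoding(int n_txns, bool patched)
{
    assert(n_txns >= 0);
    xact_rollback_count = 0;
    for (int i = 0; i < n_txns; i++)
        decode_one_txn(patched);
    return xact_rollback_count;
}
```

In this model, `rollbacks_after_decoding(5, false)` returns 5 and `rollbacks_after_decoding(5, true)` returns 0, while a genuine abort issued afterwards is still counted because the flag is restored after each decoded transaction.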
Included a TAP test that fails on master with 5/0 and passes with the patch.
If there is agreement on this shape, happy to send patches for all
supported branches. Let me know what you think.
[1] https://gitlab.com/gitlab-com/gl-infra/production/-/work_items/8290
[2] https://gitlab.com/postgres-ai/postgresql-consulting/tests-and-benchmarks/-/work_items/39
[3] https://gitlab.com/gitlab-org/orbit/knowledge-graph/-/work_items/406
Nik
| Attachment | Content-Type | Size |
|---|---|---|
| v1-xact-rollback-decoding.patch | application/octet-stream | 9.7 KB |