Re: BUG #19439: pg_stat_xact_user_tables stat not currect during the transaction

From: klemen kobau <klemen(dot)kobau(at)gmail(dot)com>
To: Xuneng Zhou <xunengzhou(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #19439: pg_stat_xact_user_tables stat not currect during the transaction
Date: 2026-05-06 08:03:08
Message-ID: CAOGhw3iNj+v-08hsfQw+5eUitCicDAHJaaJ++b9Ym0EOrMn0Gg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi Xuneng,

Thanks for the analysis and the patch proposals. As the original reporter,
I wanted to follow up since the thread has been quiet for over a month.

I am new to this process, do I need to do anything to help the process?

Kind regards

Klemen Kobau

On Wed, 6 May 2026 at 09:56, Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:

> On Sun, Mar 29, 2026 at 11:17 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> >
> > On Sat, Mar 28, 2026 at 1:47 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com>
> wrote:
> > >
> > > On Fri, Mar 27, 2026 at 10:15 AM Xuneng Zhou <xunengzhou(at)gmail(dot)com>
> wrote:
> > > >
> > > > Hi klemen,
> > > >
> > > > Thanks for the report.
> > > >
> > > > On Fri, Mar 27, 2026 at 5:36 AM PG Bug reporting form
> > > > <noreply(at)postgresql(dot)org> wrote:
> > > > >
> > > > > The following bug has been logged on the website:
> > > > >
> > > > > Bug reference: 19439
> > > > > Logged by: klemen kobau
> > > > > Email address: klemen(dot)kobau(at)gmail(dot)com
> > > > > PostgreSQL version: 18.0
> > > > > Operating system: Linux (EndeavorOS)
> > > > > Description:
> > > > >
> > > > > I am running postgres:18.0 in a docker container, the
> configuration is as
> > > > > follows:
> > > > >
> > > > > postgres:
> > > > > image: postgres:18.0
> > > > > command: [
> > > > > "postgres",
> > > > > "-N", "200",
> > > > > "-c", "max_prepared_transactions=100",
> > > > > "-c", "wal_level=logical",
> > > > > "-c", "max_wal_senders=10",
> > > > > "-c", "max_replication_slots=20",
> > > > > "-c", "wal_keep_size=10",
> > > > > "-c", "max_slot_wal_keep_size=1024"
> > > > > ]
> > > > > environment:
> > > > > POSTGRES_USER: postgres
> > > > > POSTGRES_PASSWORD: postgres
> > > > > POSTGRES_DB: postgres
> > > > > TZ: UTC
> > > > > PGTZ: UTC
> > > > > ports:
> > > > > - 5432:5432
> > > > > volumes:
> > > > > - postgres_data:/var/lib/postgresql
> > > > >
> > > > > I use psql version 18.3.
> > > > >
> > > > > I run the following:
> > > > >
> > > > > psql -h localhost -p 5432 -U postgres -d postgres -c "
> > > > > -- Transaction 1: insert 1 row, check stats, commit
> > > > > BEGIN;
> > > > > CREATE TABLE IF NOT EXISTS temp.xact_test (id serial PRIMARY KEY,
> val text);
> > > > > INSERT INTO temp.xact_test (val) VALUES ('a');
> > > > > SELECT relname, n_tup_ins FROM pg_stat_xact_user_tables WHERE
> relname =
> > > > > 'xact_test';
> > > > > COMMIT;
> > > > >
> > > > > -- Transaction 2: insert 1 row, check stats
> > > > > BEGIN;
> > > > > INSERT INTO temp.xact_test (val) VALUES ('b');
> > > > > SELECT relname, n_tup_ins FROM pg_stat_xact_user_tables WHERE
> relname =
> > > > > 'xact_test';
> > > > > ROLLBACK;
> > > > >
> > > > > -- Cleanup
> > > > > DROP TABLE temp.xact_test;
> > > > > "
> > > > >
> > > > > And the output is
> > > > >
> > > > > BEGIN
> > > > > CREATE TABLE
> > > > > INSERT 0 1
> > > > > relname | n_tup_ins
> > > > > -----------+-----------
> > > > > xact_test | 1
> > > > > (1 row)
> > > > >
> > > > > COMMIT
> > > > > BEGIN
> > > > > INSERT 0 1
> > > > > relname | n_tup_ins
> > > > > -----------+-----------
> > > > > xact_test | 2
> > > > > (1 row)
> > > > >
> > > > > ROLLBACK
> > > > > DROP TABLE
> > > >
> > > > I can also reproduce this behavior on HEAD.
> > > >
> > > > > This is not what I would expect from reading
> > > > >
> https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-STATS-VIEWS
> ,
> > > > > where it states
> > > > >
> > > > > > A transaction can also see its own statistics (not yet flushed
> out to the
> > > > > shared memory statistics) in the views pg_stat_xact_all_tables,
> > > > > pg_stat_xact_sys_tables, pg_stat_xact_user_tables, and
> > > > > pg_stat_xact_user_functions. These numbers do not act as stated
> above;
> > > > > instead they update continuously throughout the transaction.
> > > > >
> > > > > based on this, I would expect that the numbers are updated each
> time an
> > > > > insert happens and that they are isolated per transaction.
> > > > >
> > > >
> > > > This snippet of the doc feels somewhat ambiguous. The current
> behavior
> > > > does not seem fully consistent with wording such as “throughout the
> > > > transaction.” It also seems more appropriate for these table views to
> > > > reflect stats within their own scopes, rather than cumulative values
> > > > spanning txn boundaries.
> > > >
> > >
> > > I’ve looked into this issue and 'd like to propose a patch to address
> > > it. Feedback is very welcome.
> > >
> > > --- Root cause
> > >
> > > Since the stats subsystem was rewritten (commit 5891c7a8), each
> > > backend keeps a hash of pending stats entries that persist across
> > > transaction boundaries. Entries that flush successfully are deleted
> > > from the backend-local pending list; entries not flushed remain
> > > pending with their accumulated counters. Flushing is driven by
> > > pgstat_report_stat(), called from the ReadyForQuery path, and subject
> > > to a rate limiter.
> > >
> > > The pg_stat_xact_* views read these pending entries directly via
> > > find_tabstat_entry() / find_funcstat_entry(). Both functions return
> > > the raw accumulated counters without per-transaction scoping.
> > >
> > > The bug is deterministic when multiple top-level transactions are
> > > processed within a single simple-query message, because there is no
> > > ReadyForQuery boundary between the transactions and therefore no
> > > opportunity to flush and remove the pending entry.
> > >
> > > --- Fix
> > >
> > > The patch introduces a per-entry "transaction baseline" — a lazy
> > > snapshot of the counters taken the first time an entry is touched in
> > > each new top-level transaction. The accessor functions
> > > (find_tabstat_entry, find_funcstat_entry) then subtract the baseline
> > > from the current counters, yielding only the current transaction's
> > > delta.
> > >
> > > The baseline is keyed by MyProc->vxid.lxid. A static inline helper,
> > > pgstat_ensure_xact_baseline(), is called at every nontransactional
> > > counter-increment site (the 7 event-counter macros in pgstat.h and the
> > > 4 non-inline counter functions in pgstat_relation.c). After the first
> > > call per entry per transaction, the check reduces to a single integer
> > > comparison with a predictably not-taken branch.
> > >
> > > For function stats, a new PgStat_FunctionPending wrapper struct embeds
> > > PgStat_FunctionCounts at offset 0 with the baseline fields appended,
> > > so the flush callback requires only a trivial cast change.
> > >
> > > --- Testing
> > >
> > > The regression test is a TAP test under src/bin/psql/t/ that uses psql
> > > -c to send multi-statement strings as single simple-query messages.
> > > src/bin/psql/t look like the right existing harness for the primary
> > > regression because psql -c sends the whole multi-statement string as
> > > one simple-query message via simple_action_list dispatch. By contrast,
> > > ordinary pg_regress SQL files are executed by psql in file/stdin mode,
> > > which runs through MainLoop() and dispatches top-level statements one
> > > at a time via SendQuery(), creating ReadyForQuery boundaries that
> > > could mask the bug.
> > >
> > > The test covers three scenarios: table n_tup_ins and seq_scan
> > > counters, and function calls — each verifying isolation across
> > > consecutive top-level transactions within a single message.
> > >
> >
> > postgres % cat output.txt
> > BEGIN
> > CREATE TABLE
> > INSERT 0 1
> > relname | n_tup_ins
> > -----------+-----------
> > xact_test | 1
> > (1 row)
> >
> > COMMIT
> > BEGIN
> > INSERT 0 1
> > relname | n_tup_ins
> > -----------+-----------
> > xact_test | 2
> > (1 row)
> >
> > ROLLBACK
> > DROP TABLE
> >
> > I can also reproduce this behavior in v13. So this issue pre-existed
> > the major refactor 5891c7a8.
> >
>
> After further thought of v1, I'd like to also propose an alternative
> approach. The root cause and testing rationale are the same as
> described upthread; what changes is how the baseline is established.
>
> --- Problems with the lazy (v1) approach
> The v1 patch snapshots the baseline lazily -- on every
> nontransactional counter increment, it checks whether baseline_lxid
> matches the current
> transaction and snapshots if not. This has two drawbacks:
>
> 1) Hot-path overhead
> Every pgstat_count_heap_scan(), pgstat_count_buffer_read_block(), etc.
> acquires a branch. After the first call per entry per transaction, the
> branch is predictably not-taken, but it could still present on some of
> the most frequently executed paths in the backend.
>
> 2) Maintenance burden
> Every new counter added to PgStat_TableCounts requires a corresponding
> baseline snapshot at its increment site. Miss one, and that counter
> silently leaks across transactions
>
> --- eager baseline sweep
> The attached patch records the baseline eagerly at transaction
> boundaries instead of lazily at counter-increment sites.
> pgstat_set_pending_baselines() iterates the pgStatPending list and
> snapshots each entry's current counts into an xact_baseline field via
> struct assignment. It is called from AtEOXact_PgStat() (after folding
> transactional counts and removing dropped entries) and from
> PostPrepare_PgStat() (after relation cleanup), covering commit, abort,
> and PREPARE TRANSACTION. The view accessors unconditionally subtract
> the baseline. For entries created in the current transaction,
> xact_baseline is zero-initialized, so the subtraction is a no-op.
>
> I don’t have a clear preference between the two approaches; both are
> presented for review.
>
> --
> Best,
> Xuneng
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kieran McCusker 2026-05-06 08:41:03 Re: BUG #19471: Packaging problem with ogr_fdw_18
Previous Message PG Bug reporting form 2026-05-06 07:53:39 BUG #19471: Packaging problem with ogr_fdw_18