Stack-based tracking of per-node WAL/buffer usage

From: Lukas Fittl <lukas(at)fittl(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>
Subject: Stack-based tracking of per-node WAL/buffer usage
Date: 2025-08-31 23:57:01
Message-ID: CAP53PkzdBK8VJ1fS4AZ481LgMN8f9mJiC39ZRHqkFUSYq6KWmg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Please find attached a patch series that introduces a new paradigm for how
per-node WAL/buffer usage is tracked, with two primary goals: (1) reduce
overhead of EXPLAIN ANALYZE, (2) enable future work like tracking estimated
distinct buffer hits [0].

Currently we utilize pgWalUsage/pgBufferUsage as global counters, and in
InstrStopNode we call the rather
expensive BufferUsageAccumDiff/WalUsageAccumDiff to know how much activity
happened within a given node cycle.

This proposal instead uses a stack, where each time we enter a node
(InstrStartNode) we point a new global (pgInstrStack) to the current stack
entry. Whilst we're in that node we increment buffer/WAL usage statistics
to the stack entry. On exit (InstrStopNode) we restore the previous entry.

This change provides about a 10% performance benefit for EXPLAIN ANALYZE on
paths that repeatedly enter InstrStopNode, e.g. SELECT COUNT(*):

CREATE TABLE test(id int);
INSERT INTO test SELECT * FROM generate_series(0, 1000000);

master (124ms, best out of 3):

postgres=# EXPLAIN (ANALYZE) SELECT COUNT(*) FROM test;
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=16925.01..16925.02 rows=1 width=8) (actual
time=124.910..124.910 rows=1.00 loops=1)
Buffers: shared hit=752 read=3673
-> Seq Scan on test (cost=0.00..14425.01 rows=1000001 width=0) (actual
time=0.201..62.228 rows=1000001.00 loops=1)
Buffers: shared hit=752 read=3673
Planning Time: 0.116 ms
Execution Time: 124.961 ms

patched (109ms, best out of 3):

postgres=# EXPLAIN (ANALYZE) SELECT COUNT(*) FROM test;
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=16925.01..16925.02 rows=1 width=8) (actual
time=109.788..109.788 rows=1.00 loops=1)
Buffers: shared hit=940 read=3485
-> Seq Scan on test (cost=0.00..14425.01 rows=1000001 width=0) (actual
time=0.153..69.368 rows=1000001.00 loops=1)
Buffers: shared hit=940 read=3485
Planning Time: 0.134 ms
Execution Time: 109.837 ms
(6 rows)

I have also prototyped a more ambitious approach that completely removes
pgWalUsage/pgBufferUsage (utilizing the stack-collected data for e.g.
pg_stat_statements), but for now this patch set does not include that
change, but instead keeps adding to these legacy globals as well.

Patches attached:

0001: Separate node instrumentation from other use of Instrumentation struct

Previously different places (e.g. query "total time") were repurposing
the per-node Instrumentation struct. Instead, simplify the Instrumentation
struct to only track time, WAL/buffer usage, and tuple counts. Similarly,
drop the use of InstrEndLoop outside of per-node instrumentation. Introduce
the NodeInstrumentation struct to carry forward the per-node
instrumentation information.

0002: Replace direct changes of pgBufferUsage/pgWalUsage with INSTR_* macros

0003: Introduce stack for tracking per-node WAL/buffer usage

Feedback/thoughts welcome!

CCing Andres since he had expressed interest in this off-list.

[0]: See lightning talk slides from PGConf.Dev discussing an HLL-based
EXPLAIN (BUFFERS DISTINCT):
https://resources.pganalyze.com/pganalyze_PGConf.dev_2025_shared_blks_hit_distinct.pdf

Thanks,
Lukas

--
Lukas Fittl

Attachment Content-Type Size
v1-0002-Replace-direct-changes-of-pgBufferUsage-pgWalUsag.patch application/octet-stream 9.0 KB
v1-0001-Separate-node-instrumentation-from-other-use-of-I.patch application/octet-stream 21.8 KB
v1-0003-Introduce-stack-for-tracking-per-node-WAL-buffer-.patch application/octet-stream 12.3 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2025-09-01 00:05:52 Re: Invalid remote sampling test in postgres_fdw.
Previous Message Atsushi Torikoshi 2025-08-31 22:58:21 Re: COPY TO: provide hint when WHERE clause is used