| From: | Lukas Fittl <lukas(at)fittl(dot)com> |
|---|---|
| To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | Peter Smith <smithpb2250(at)gmail(dot)com> |
| Subject: | Re: Stack-based tracking of per-node WAL/buffer usage |
| Date: | 2026-03-08 04:27:38 |
| Message-ID: | CAP53PkxrmpECzVFpeeEEHDGe6u625s+YkmVv5-gw3L_NDSfbiA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Mon, Feb 23, 2026 at 8:18 PM Lukas Fittl <lukas(at)fittl(dot)com> wrote:
> 0001 addresses the issue that Peter raised and corrects the macro name
> + adds a missing comment.
This has been pushed by Andres, thanks again Peter for noting this!
See attached v7, with the changes as noted later. But first, some
fresh performance numbers that take into account extra inlining added
in a new v7/0006:
Example (default shared_buffers, runtimes are best out of 3-ish):
CREATE TABLE lotsarows(key int not null);
INSERT INTO lotsarows SELECT generate_series(1, 50000000);
VACUUM FREEZE lotsarows;
250ms actual runtime (no instrumentation)
BUFFERS OFF, TIMING OFF:
295ms master
295ms with stack-based instrumentation only (v7/0005) -- no change
because BUFFERS OFF
260ms with ExecProcNodeInstr inlining work (v7/0006)
BUFFERS ON, TIMING OFF:
380ms master
305ms with stack-based instrumentation only (v7/0005)
280ms with ExecProcNodeInstr inlining work (v7/0006)
In summary: For BUFFERS ON, we're going from 52% overhead in this
stress test, to 12% overhead (22% without the ExecProcNodeInstr
change). With rows instrumentation only, we go from 18% to 3%
overhead.
> 0002/0003 are the same as in v5 and are preparatory commits that
> should be in good shape to go.
I split out the trigger instrumentation refactoring into its own
commit (0001) per Andres' request off-list.
0002 is the node instrumentation refactoring, which changed slightly,
because Instrumentation is now treated as a base struct and
NodeInstrumentation now contains an Instrumentation "instr" field to
reduce duplication.
0003 replaces pgBufferUsage/pgWalUsage with INSTR_* macros, pretty
much unchanged from v6/0002.
> 0004 adds new regression tests that pass on both master and the
> patched version, and stress test some of the nested instrumentation
> handling.
This patch is the same in v7 as in v6.
> 0005 is what was previously 0003, and introduces the stack-based
> instrumentation mechanism.
0005 is still the stack-based instrumentation commit, but has several
improvements over v6:
- Andres suggested a different stack structure in a conversation
off-list, where we track the stack entries in a separately kept array
instead of with inline pointers on the Instrumentation structure -
that ends up making the whole change a lot easier to reason about, and
clarifies the abort handling as well
- I've separated out the ResOwner handling into a new
"QueryInstrumentation" struct - this makes it clearer that one can use
Instrumentation directly with certain caveats (i.e. one must deal with
aborts directly), and feels better now that we're including
Instrumentation in NodeInstrumentation as well
> 0006 could probably be merged into 0005, and is the existing
> mechanical change to drop the pgBufferUsage global, but kept separate
> for easier review for now.
I've merged that together now into 0005 because it feels better to
reason about this as a whole and drop pgBufferUsage within the same
commit where we're adding the stack-based instrumentation.
There is now a new 0006 patch that further optimizes ExecProcNodeInstr
(with performance numbers as noted in the beginning of this mail).
During my testing I realized that the conditional checks in
InstrStartNode/InstrStopNode are quite unnecessary - we never change
what gets instrumented during execution, and so we can create
specialized functions for the different instrumentation combinations,
giving the compiler a much better chance at generating optimal
instructions. The assembly here looks a lot better now.
> 0007 is the stack-mechanism for splitting out "Table Buffers" for
> Index Scans.
This stayed pretty much the same from the earlier version, but I
reworked this a bit to avoid special Instr functions for dealing with
this case, instead it re-uses NodeInstrumentation and
InstrStartNode/InstrStopNode.
> 0008 adds a pg_session_buffer_usage() module for testing the global
> counters. This helps to verify the handling of aborts behave sanely
> (and can run on both master and patched), but I don't think we should
> commit this.
This is identical with v6/0008, but continues to prove very useful in
testing the refactorings.
> Two questions on my mind:
>
> 1) Should we call this "instrumentation context" (InstrContext?)
> instead of "instrumentation stack" (InstrStack)? When writing comments
> I found the difference between "stack entry" and "stack" confusing at
> times, and "context" feels a bit more clear as a term (e.g. as in
> "CurrentInstrContext" or "CurrentInstrumentationContext").
With the refactoring of the stack-mechanism done now, I think we don't
need to change the naming here, since "InstrStack" as a commonly used
struct is gone now (its just Instrumentation now), so I'd suggest
keeping the "stack" terminology to describe the approach itself.
> 2) For 0007, "Table Buffers" feels a bit inconsistent with "Heap
> Fetches" and "Heap Blocks" used elsewhere, should we potentially use
> "Heap Buffers", or change the existing names to "Table ..." instead?
That is still an open question.
Thanks,
Lukas
--
Lukas Fittl
| Attachment | Content-Type | Size |
|---|---|---|
| v7-0004-instrumentation-Add-additional-regression-tests-c.patch | application/octet-stream | 23.5 KB |
| v7-0002-instrumentation-Separate-per-node-logic-from-othe.patch | application/octet-stream | 26.3 KB |
| v7-0001-instrumentation-Separate-trigger-logic-from-other.patch | application/octet-stream | 9.7 KB |
| v7-0003-instrumentation-Replace-direct-changes-of-pgBuffe.patch | application/octet-stream | 9.9 KB |
| v7-0005-Optimize-measuring-WAL-buffer-usage-through-stack.patch | application/octet-stream | 66.3 KB |
| v7-0006-instrumentation-Optimize-ExecProcNodeInstr-instru.patch | application/octet-stream | 11.6 KB |
| v7-0007-Index-scans-Show-table-buffer-accesses-separately.patch | application/octet-stream | 16.8 KB |
| v7-0008-Add-pg_session_buffer_usage-contrib-module.patch | application/octet-stream | 25.5 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Lukas Fittl | 2026-03-08 04:31:07 | Re: Stack-based tracking of per-node WAL/buffer usage |
| Previous Message | Tatsuo Ishii | 2026-03-07 23:47:22 | Re: Row pattern recognition |