Re: Return value of XLogInsertRecord() for XLOG_SWITCH record

From: 반지현 <rring0727(at)gmail(dot)com>
To: ZizhuanLiu X-MAN <44973863(at)qq(dot)com>, cca5507 <cca5507(at)qq(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Return value of XLogInsertRecord() for XLOG_SWITCH record
Date: 2026-06-11 06:53:19
Message-ID: 3CAC439F-E1F0-4F37-BE58-CC9D7CDE889C@gmail.com
Lists: pgsql-hackers

Hi ZizhuanLiu,

Thanks for the caller survey. Classifying the call sites by how they
consume the return value is a useful way to frame the risk, and I agree
pg_switch_wal() is the one path where the LSN escapes to external
consumers.

To help bound that risk, I re-ran my April comparison on a second
platform (Windows 11, MSYS2/gcc 16.1.0, commit 9d141466ff, 19beta1),
building unpatched master and v2 side by side and running the same
sequence on identically-initialized clusters:

before_switch: 0/017CF958 (both builds)
switch_1: 0/017CF970 (both builds)
after_1: 0/02000000 (both builds)
switch_2/3: 0/02000000, no-op (both builds)

The two builds returned byte-for-byte identical LSNs at every step,
and "make check" passes on the patched build (All 245 tests passed).
Incidentally this also confirms numerically that the MAXALIGN() change
is a no-op: 0/017CF958 + 24 = 0/017CF970, since SizeOfXLogRecord is
already 8-byte aligned.
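
The arithmetic can be checked with a minimal standalone sketch. It
assumes 8-byte maximum alignment and a 24-byte record header, as quoted
above; the names ending in _sketch are hypothetical stand-ins, not the
PostgreSQL macros themselves.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumptions for illustration: 8-byte maximum alignment and a 24-byte
 * record header, matching the figures quoted in this message. */
#define ALIGNOF_MAX_SKETCH 8u
#define RECORD_HDR_SKETCH  24u

/* Round len up to the next multiple of ALIGNOF_MAX_SKETCH. */
static uint64_t
maxalign_sketch(uint64_t len)
{
    return (len + (ALIGNOF_MAX_SKETCH - 1)) &
           ~(uint64_t) (ALIGNOF_MAX_SKETCH - 1);
}

/* End of a header-only record that starts at 'start' and does not
 * touch a page boundary. */
static uint64_t
switch_end_sketch(uint64_t start)
{
    return start + maxalign_sketch(RECORD_HDR_SKETCH);
}

/* Print an LSN in the hi/lo hex form used in the results above. */
static void
print_lsn_sketch(const char *label, uint64_t lsn)
{
    printf("%s: %X/%08X\n", label,
           (unsigned) (lsn >> 32), (unsigned) lsn);
}
```

Since 24 is already a multiple of 8, maxalign_sketch(24) is 24, and
switch_end_sketch(0x017CF958) yields 0x017CF970, which matches the
switch_1 value above.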

As far as I can tell, the only input where v2 can return a different
value than current master is when the XLOG_SWITCH record ends exactly
on a page boundary (StartPos at page offset XLOG_BLCKSZ - 24): the old
code takes the cross-page branch and adds a page header size even though
no part of the record lies on the next page, while v2 returns the
boundary itself. The v2 value is the one consistent with the end-pointer
convention of XLogBytePosToEndRecPtr(). So any external tool that
observed a different value in that rare alignment was depending on a
value inconsistent with PostgreSQL's own end-pointer semantics. I'd
read v2 as correcting that, rather than breaking compatibility.
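
To make the divergent case concrete, here is a simplified model of the
two behaviors as I describe them above. It is a sketch, not the code
from xlog.c or from the patch: it assumes the default 8192-byte
XLOG_BLCKSZ and a 24-byte short page header, ignores the long header at
segment starts, and all names ending in _sketch are hypothetical.

```c
#include <stdint.h>

#define BLCKSZ_SKETCH     8192u  /* assumed default XLOG_BLCKSZ */
#define RECORD_HDR_SKETCH 24u    /* SizeOfXLogRecord, as quoted above */
#define SHORT_PHD_SKETCH  24u    /* assumed short page header size */

/* Model of current master: add a page header whenever the start and
 * the raw end position fall on different pages. */
static uint64_t
switch_end_old_sketch(uint64_t start)
{
    uint64_t end = start + RECORD_HDR_SKETCH;

    if (start / BLCKSZ_SKETCH != end / BLCKSZ_SKETCH)
        end += SHORT_PHD_SKETCH;
    return end;
}

/* Model of v2: add a page header only if the record's last byte
 * (end - 1) actually lies on the next page. */
static uint64_t
switch_end_v2_sketch(uint64_t start)
{
    uint64_t end = start + RECORD_HDR_SKETCH;

    if (start / BLCKSZ_SKETCH != (end - 1) / BLCKSZ_SKETCH)
        end += SHORT_PHD_SKETCH;
    return end;
}
```

For a made-up start position of 0/02001FE8 (page offset 8192 - 24), the
old model returns 0/02002018 while the v2 model returns 0/02002000. For
0/017CF958, and for a record that genuinely straddles the boundary such
as one starting at 0/02001FF0, both models agree.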

So in summary: identical SQL-visible behavior on all common paths
(verified on Linux in April and Windows now), with the single divergent
case being a consistency fix. If it would help, I could try to craft a
reproduction of the exact page-boundary case (padding WAL position via
pg_logical_emit_message), or write a small TAP test pinning down the
boundary behavior of pg_switch_wal().

Regards,
Jihyun Bahn

Attachment Content-Type Size
comparison-results.txt.txt text/plain 3.1 KB
unknown_filename text/plain 3.0 KB
