Re: Proposal: Prevent Primary/Standby SLRU divergence during MultiXact truncation

From: Ayush Tiwari <ayushtiwari(dot)slg01(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposal: Prevent Primary/Standby SLRU divergence during MultiXact truncation
Date: 2026-03-17 15:07:26
Message-ID: CAJTYsWV1qSPVYzfZn0OXD661xReuKvVT0ccPuRb1PDv6MVjRPw@mail.gmail.com
Lists: pgsql-hackers

Hi,

Thank you for the response.

On Tue, 17 Mar 2026 at 03:40, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:

>
> Replaying the record will perform the same sanity checks against
> wraparound as the primary does.
>
> Hmm, although why did I not apply commit 817f74600d to 'master', only
> backbranches? The bug that it fixed was related to minor version
> upgrade, and thus it was not needed on 'master', but the code change
> would nevertheless make a lot of sense on 'master' too.
>

Agreed. Once 817f74600d is on master, the standby would actually evaluate
the SimpleLruTruncate wraparound backstop instead of bypassing it.

However, the backstop is documented as catching "wraparound bugs elsewhere
in SLRU handling." If such a bug corrupts latest_page_number on the
primary, the standby — which derives its latest_page_number independently
from ZERO_OFF_PAGE replay and StartupMultiXact() — would not share the same
corruption. The primary would skip the truncation, but the standby would
see a healthy latest_page_number and proceed.
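
To make the scenario concrete, here is a toy model of the backstop. It is
only a sketch: toy_page_precedes() and toy_truncate_proceeds() are
hypothetical names, and the 32-bit modular comparison is a simplified
stand-in for the real PagePrecedes callbacks, not a copy of them.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy wraparound-aware ordering: "a precedes b" when the signed modular
 * difference is negative. Simplified stand-in, not PostgreSQL code.
 */
static bool
toy_page_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Models the backstop: if the node's own latest page appears to precede
 * the cutoff, the truncation is skipped ("apparent wraparound").
 * Returns true when the truncation would be carried out.
 */
static bool
toy_truncate_proceeds(uint32_t latest_page, uint32_t cutoff_page)
{
    if (toy_page_precedes(latest_page, cutoff_page))
        return false;
    return true;
}
```

With a cutoff page of 900, a primary whose latest page was corrupted to 10
gets false (skip), while a standby that independently tracks a latest page
of 1000 gets true (truncate), even though both see the same record.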

> Have you been able to reproduce that?
>

I have reproduced the primary-side condition on an unmodified tree using
gdb in batch mode: attach to the VACUUM backend after
WriteMTruncateXlogRec() returns, corrupt latest_page_number, and resume.
The primary logs "apparent wraparound" and skips the physical deletion,
while pg_waldump confirms the TRUNCATE_ID record is present in the WAL. I
have not yet set up a streaming replica to demonstrate end-to-end
divergence and promotion failure.

>
> I agree that would probably be better. I'm not sure how straightforward
> it will be to implement though, I wouldn't want to add much extra code
> just for this.
>

One approach that might keep the footprint small: we could inline the same
PagePrecedes check that SimpleLruTruncate uses directly in
TruncateMultiXact(), before START_CRIT_SECTION(). Something like:

    if (MultiXactOffsetCtl->PagePrecedes(
            pg_atomic_read_u64(&MultiXactOffsetCtl->shared->latest_page_number),
            MultiXactIdToOffsetPage(PreviousMultiXactId(newOldestMulti))) ||
        MultiXactMemberCtl->PagePrecedes(
            pg_atomic_read_u64(&MultiXactMemberCtl->shared->latest_page_number),
            MXOffsetToMemberPage(newOldestOffset)))
    {
        ereport(LOG,
                (errmsg("skipping multixact truncation due to apparent wraparound")));
        LWLockRelease(MultiXactTruncationLock);
        return;
    }

No new functions, no changes to slru.c or the replay path — just the same
condition evaluated earlier so we never enter the critical section or write
WAL for a truncation that won't be carried out. Does this seem like a
reasonable direction?
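
As a rough illustration of why the ordering matters, the toy model below
compares "write WAL, then check" against "check, then write WAL". All names
(ToyPrimary, toy_precedes, toy_truncate_*) are hypothetical and the
comparison is a simplified 32-bit stand-in; it only models the control
flow, not the actual multixact code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy state; none of these names exist in PostgreSQL. */
typedef struct ToyPrimary
{
    uint32_t latest_page;      /* stand-in for latest_page_number */
    int      wal_records;      /* truncation records written so far */
    int      files_truncated;  /* physical truncations performed */
} ToyPrimary;

/* Simplified wraparound-aware comparison (assumption, not the real one). */
static bool
toy_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

/* Ordering described in this thread: the record is written first. */
static void
toy_truncate_log_then_check(ToyPrimary *p, uint32_t cutoff)
{
    p->wal_records++;                       /* record is already in WAL */
    if (toy_precedes(p->latest_page, cutoff))
        return;                             /* backstop skips deletion */
    p->files_truncated++;
}

/* Proposed ordering: evaluate the same condition before writing WAL. */
static void
toy_truncate_check_then_log(ToyPrimary *p, uint32_t cutoff)
{
    if (toy_precedes(p->latest_page, cutoff))
        return;                             /* nothing written, nothing done */
    p->wal_records++;
    p->files_truncated++;
}
```

With latest_page corrupted to 10 and a cutoff of 900, the first variant
leaves one WAL record and no truncation, which is the state a standby could
then replay differently. The second variant leaves neither.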

Regards,
Ayush
