Re: Bug in MultiXact replay compat logic for older minor version after crash-recovery

From: 段坤仁(刻韧) <duankunren(dot)dkr(at)alibaba-inc(dot)com>
To: "Heikki Linnakangas" <hlinnaka(at)iki(dot)fi>
Cc: "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>, "x4mmm" <x4mmm(at)yandex-team(dot)ru>
Subject: Re: Bug in MultiXact replay compat logic for older minor version after crash-recovery
Date: 2026-03-22 13:09:05
Message-ID: 76f70088-e2ee-4b17-9e12-fa89f2a08393.duankunren.dkr@alibaba-inc.com
Lists: pgsql-hackers

Thanks for the v2 patch.

On 20/03/2026 16:19, Heikki Linnakangas wrote:
> it means that tracking the latest page we have zeroed is not merely
> an optimization to avoid excessive SimpleLruDoesPhysicalPageExist()
> calls, it's needed for correctness.

Agreed.

On 20/03/2026 18:14, Heikki Linnakangas wrote:
> I also added another safety measure: before calling
> SimpleLruDoesPhysicalPageExist(), flush all the SLRU buffers.

This is more robust than scanning the SLRU buffers first and only
calling SimpleLruDoesPhysicalPageExist() on a miss, which would
rely on the SLRU eviction invariant.

I walked through the scenarios I could think of. Let N be the last
multixid on offset page P, so N+1 falls on page P+1.
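
To make the N / P relationship concrete, here is a small sketch of the page arithmetic. It assumes the default 8 kB block size and 4-byte offsets (as in the back branches), which gives 2048 entries per offsets page; the helper names are illustrative and not PostgreSQL identifiers, and multixid wraparound is ignored.

```python
# Assumptions: BLCKSZ = 8192 and a 4-byte MultiXactOffset, so one
# offsets SLRU page holds 2048 entries. Wraparound is ignored.
BLCKSZ = 8192
OFFSET_SIZE = 4
OFFSETS_PER_PAGE = BLCKSZ // OFFSET_SIZE


def offset_page(multi: int) -> int:
    """Offsets page that holds the entry for multixid `multi`."""
    return multi // OFFSETS_PER_PAGE


def last_multi_on_page(page: int) -> int:
    """The last multixid whose offset entry lives on `page` (the N above)."""
    return (page + 1) * OFFSETS_PER_PAGE - 1


if __name__ == "__main__":
    P = 5
    N = last_multi_on_page(P)
    # N sits on page P, while N+1 is the first entry of page P+1.
    print(N, offset_page(N), offset_page(N + 1))
```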

(a) Old-version WAL (CREATE_ID:N before ZERO_OFF_PAGE:P+1):
last_initialized_offsets_page = P from earlier ZERO_OFF_PAGE.
init_needed = (P == P) = true -> init P+1. Correct.
Later ZERO_OFF_PAGE:P+1 is skipped via pre_initialized_offsets_page.

(b) Crash-restart, page P+1 not on disk (the original bug):
last_initialized_offsets_page = -1, fallback path fires.
SimpleLruDoesPhysicalPageExist(P+1) = false -> init. Correct.

(c) Crash-restart, page P+1 already on disk:
Same fallback, SimpleLruDoesPhysicalPageExist(P+1) = true -> skip.
last_initialized_offsets_page stays -1 until the next ZERO_OFF_PAGE
record sets it, which switches replay back to the fast path.

(d) Out-of-order CREATE_IDs (ZERO_OFF_PAGE:P+1 -> CREATE_ID:N+1 ->
CREATE_ID:N+2 -> CREATE_ID:N):
N+1 and N+2 don't cross a page boundary, compat logic not entered.
CREATE_ID:N: init_needed = (P+1 == P) = false -> skip.
Page P+1 is not re-zeroed, data from N+1/N+2 preserved.

(e) Consecutive page crossings (N is the last multixid on page P,
later M is the last multixid on page P+1):
After init of P+1: last_initialized_offsets_page = P+1.
CREATE_ID:M: init_needed = (P+1 == P+1) = true -> init P+2.
Tracking advances monotonically across page boundaries.

The logic looks correct to me in all the cases above.
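
For anyone who wants to replay these cases mechanically, below is a toy model of the decision logic as I understand it from the walkthrough above. It is a sketch reconstructed from this email, not the patch itself: the class, the `on_disk` set standing in for SimpleLruDoesPhysicalPageExist(), and the 2048 entries-per-page constant are all assumptions, and wraparound and the SLRU flush are not modelled.

```python
from dataclasses import dataclass, field

OFFSETS_PER_PAGE = 2048  # assumption: 8 kB pages, 4-byte offsets


def offset_page(multi: int) -> int:
    return multi // OFFSETS_PER_PAGE


@dataclass
class ReplayModel:
    """Hypothetical stand-in for the replay state discussed in this thread."""
    on_disk: set = field(default_factory=set)  # models the physical-page check
    last_initialized: int = -1  # last_initialized_offsets_page
    pre_initialized: int = -1   # pre_initialized_offsets_page
    zeroed: list = field(default_factory=list)  # pages zeroed, in order

    def _zero(self, page: int) -> None:
        self.zeroed.append(page)
        self.on_disk.add(page)  # simplification: zeroed pages count as present
        self.last_initialized = page

    def zero_off_page(self, page: int) -> bool:
        """Replay ZERO_OFF_PAGE; returns True if the page was zeroed."""
        if page == self.pre_initialized:
            return False  # already zeroed by the compat logic, skip
        self._zero(page)
        return True

    def create_id(self, multi: int) -> bool:
        """Replay CREATE_ID; returns True if the next page was initialized."""
        page, next_page = offset_page(multi), offset_page(multi + 1)
        if next_page == page:
            return False  # no page boundary crossed, compat logic not entered
        if self.last_initialized == -1:
            # Fallback path after crash-restart: ask whether the page exists.
            init_needed = next_page not in self.on_disk
        else:
            # Fast path: init only if we are moving on from the tracked page.
            init_needed = self.last_initialized == page
        if init_needed:
            self._zero(next_page)
            self.pre_initialized = next_page
        return init_needed


if __name__ == "__main__":
    P = 5
    N = (P + 1) * OFFSETS_PER_PAGE - 1
    m = ReplayModel(on_disk={P})
    print("case (b) init needed:", m.create_id(N))
```

Running cases (a) through (e) against this model gives the same init/skip decisions as listed above, under the stated simplifications.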

Regards,
Duan
