Re: BUG #19490: Streaming standby on 16.14 stops applying WAL on MultiXactOffsetSLRU when primary is 16.8

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Ayush Tiwari <ayushtiwari(dot)slg01(at)gmail(dot)com>, Radim Marek <radim(at)boringsql(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #19490: Streaming standby on 16.14 stops applying WAL on MultiXactOffsetSLRU when primary is 16.8
Date: 2026-05-26 18:29:58
Message-ID: 90F2A05B-FEEB-4695-87ED-32F53C6AC097@yandex-team.ru
Lists: pgsql-bugs

> On 26 May 2026, at 17:28, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>
> looks correct

I tested that change as follows.

I set up REL_16_0 as the primary and REL_16_STABLE as the standby.

Generate multixacts in a single session using savepoints:

BEGIN;
SELECT * FROM t WHERE i = 1 FOR NO KEY UPDATE;
-- repeat 2500 times:
SAVEPOINT a; SELECT * FROM t WHERE i = 1 FOR UPDATE; ROLLBACK TO a;
COMMIT;
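
Typing the savepoint line 2500 times by hand is impractical, so the script is easier to generate. A minimal sketch, assuming the table t with an integer column i from the snippet above already exists; the function name and its parameters are hypothetical and not part of the original test:

```python
def multixact_workload(iterations: int = 2500, table: str = "t", key: int = 1) -> str:
    """Return the SQL text of the single-session workload shown above.

    Hypothetical helper: it only builds the script text and does not
    connect to any server.
    """
    lines = [
        "BEGIN;",
        # The outer lock is taken once, before any savepoint.
        f"SELECT * FROM {table} WHERE i = {key} FOR NO KEY UPDATE;",
    ]
    # One savepoint / lock / rollback round per iteration.
    step = (
        f"SAVEPOINT a; SELECT * FROM {table} WHERE i = {key} FOR UPDATE; "
        "ROLLBACK TO a;"
    )
    lines.extend([step] * iterations)
    lines.append("COMMIT;")
    return "\n".join(lines) + "\n"
```

The returned text can be written to a file and run with psql -f against the primary.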

Each iteration creates a new MultiXactId. 2500 iterations cross the SLRU page
boundary at multixact 2048 with some spare multis (we'll pickle the excess ones in
jars when all is fixed, toying with 2048 wasted dev cycles for no reason).
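
The 2048 figure follows from the page size: with the default 8 kB block and a 4-byte MultiXactOffset (as in PostgreSQL 16), one offsets page holds 8192 / 4 = 2048 entries. A small sketch of that arithmetic, assuming a fresh cluster whose first multixact ID is 1 and exactly one new multixact per iteration; the helper names are hypothetical:

```python
BLCKSZ = 8192                    # default PostgreSQL block size, in bytes
SIZEOF_MULTIXACT_OFFSET = 4      # MultiXactOffset is a uint32 in PostgreSQL 16
MULTIXACT_OFFSETS_PER_PAGE = BLCKSZ // SIZEOF_MULTIXACT_OFFSET


def offsets_page(multi: int) -> int:
    """Offsets SLRU page that holds the entry for this multixact ID."""
    return multi // MULTIXACT_OFFSETS_PER_PAGE


def boundaries_crossed(first_multi: int, last_multi: int) -> list[int]:
    """First multixact ID of every new page entered in [first_multi, last_multi]."""
    first_page = offsets_page(first_multi)
    last_page = offsets_page(last_multi)
    return [p * MULTIXACT_OFFSETS_PER_PAGE for p in range(first_page + 1, last_page + 1)]


# Assumption: the first run uses multixacts 1..2500, the second 2501..5000.
print(boundaries_crossed(1, 2500))     # boundary between page 0 and page 1
print(boundaries_crossed(2501, 5000))  # boundary between page 1 and page 2
```

Under these assumptions each run of 2500 iterations crosses exactly one page boundary, which is why two runs are enough to exercise pages 0->1 and 1->2.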

Test:
0. Run the workload on REL_16_0 primary (2500 multixacts, crossing page 0->1)
1. Take pg_basebackup
2. Run the workload again (2500 more, crossing page 1->2)
3. Start the standby
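
The four steps above can be sketched as command lines. This is only an illustration, not the buildfarm module: the install prefixes, data directory, ports and the helper name are hypothetical, and it assumes both servers run on one host, so the standby gets its own port. log_min_messages=debug1 is set so that the DEBUG1 message can show up in the standby log.

```python
from pathlib import Path


def replay_test_commands(primary_bin: str, standby_bin: str, standby_dir: str,
                         workload: str, primary_port: int = 5432,
                         standby_port: int = 5433) -> list[list[str]]:
    """Build argv lists for steps 0-3; nothing is executed here."""
    run_workload = [
        str(Path(primary_bin) / "psql"), "-X", "-v", "ON_ERROR_STOP=1",
        "-p", str(primary_port), "-d", "postgres", "-f", workload,
    ]
    take_backup = [
        # -R writes standby.signal and primary_conninfo into the backup.
        str(Path(primary_bin) / "pg_basebackup"),
        "-D", standby_dir, "-p", str(primary_port), "-R",
    ]
    start_standby = [
        # The standby is started with the newer (REL_16_STABLE) binaries.
        str(Path(standby_bin) / "pg_ctl"), "-D", standby_dir,
        "-l", str(Path(standby_dir) / "standby.log"),
        "-o", f"-p {standby_port} -c log_min_messages=debug1",
        "start",
    ]
    # Step 0: workload, step 1: backup, step 2: workload again, step 3: standby.
    return [run_workload, take_backup, run_workload, start_standby]
```

Each list can be passed to subprocess.run(cmd, check=True) in order, once the primary is running and the workload file exists.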

I observe:
Without the change, the startup process deadlocks.
With the change, the standby catches up, and the DEBUG1 message "next offsets page is
not initialized, initializing it now" confirms that the compat block fires correctly.

I packaged this test into a buildfarm module (TestReplayXversion) [0] that
builds REL_x_0 and runs this check against a REL_x_STABLE build. It reproduces the
deadlock on 14, 15, and 16; 17 and 18 pass. I am currently trying to inject a regress
WAL trace into it, but that is not working so far. On the bright side, I managed to get
PR number 42 in the buildfarm client repo.

Best regards, Andrey Borodin.

[0] https://github.com/PGBuildFarm/client-code/pull/42
