Re: BUG #19396: Standby and DR site replication broken with PANIC: WAL contains references to invalid pages messge

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: ishanjoshi(at)live(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #19396: Standby and DR site replication broken with PANIC: WAL contains references to invalid pages messge
Date: 2026-02-10 00:34:35
Message-ID: aYp9G02_B-sJdO6X@paquier.xyz
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, Feb 09, 2026 at 07:31:13AM +0000, PG Bug reporting form wrote:
> Primary db was not impacted, however standby node and DR site replication
> broken, I tried to reinit with latest backup + archive loading from
> pgbackrest backup but it fails with same error once the corrupt wal/archive
> file applying the changes. I had to reinit with pgbasebackup with 40TB
> database which took about 45 hrs of time.
>
> Looks like some RACE condition happend to WAL file that generate the issue.
> looks like potential bug of it.

Perhaps so. However, it is basically impossible to determine if this
is actually an issue without more information. Hence, one would need
more input about the workloads involved (concurrency included), the
pages touched, and the WAL patterns at least. The best thing possible
would be a reproducible self-contained test case, of course, which
could be used to evaluate the versions impacted and the potential
solutions. Race conditions like that with predefined WAL patterns
should be easy to reproduce with some injection points to force a
strict ordering of WAL record, particularly if this is a problem that
can be reproduced after a startup, where we just need to make sure
that a node is able to recover.

One thing that may matter, on top of my mind: does your backup setup
rely on the in-core incremental backups with some combined backups?
That could be a contributing factor, or not.
--
Michael

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2026-02-10 00:37:09 Re: BUG #19393: pg_upgrade fails with duplicate key violation when CHECK constraint named *_not_null exists
Previous Message surya poondla 2026-02-09 23:51:10 Re: Possibly a bug