Re: Startup PANIC on standby promotion due to zero-filled WAL segment

From:	Alena Vinter <dlaaren8(at)gmail(dot)com>
To:	Michael Paquier <michael(at)paquier(dot)xyz>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Startup PANIC on standby promotion due to zero-filled WAL segment
Date:	2025-12-23 09:33:30
Message-ID:	CAGWv16JyznsODC8e7T-UuGSOE+6ZM1MjdCCgP1ZVg5iCK7Yh-g@mail.gmail.com
Lists:	pgsql-hackers

Hi Michael,

Thanks for the review. To clarify: TLI 1 does not diverge — it is fully
replicated to the standby before the timeline switch. The test then
intentionally slows down replication on TLI 2 (e.g., by delaying WAL
shipping), reproducing the scenario I illustrated. As far as I’m aware,
`fsync` is `on` by default, and the test does not modify it — so no WAL
records are lost due to unsafe flushing.

The core issue is that the new timeline’s segment is zero-initialized
instead of being copied from the same segment of the previous timeline (as
is done in crash-recovery startup). As a result, the startup process cannot
finish recovery: the end of WAL was never replicated, so reading it fails
with errors like “invalid magic number”.
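
To make the failure mode in the paragraph above concrete, here is a toy model in Python. It is only a sketch and not PostgreSQL code: `PAGE_SIZE`, `SEGMENT_PAGES`, `TOY_MAGIC` and all function names are hypothetical, and the only assumption borrowed from the real format is that each WAL page begins with a 16-bit magic value that the reader validates.

```python
import struct

PAGE_SIZE = 8192      # assumed page size for this toy model
SEGMENT_PAGES = 4     # toy segment; real segments hold far more pages
TOY_MAGIC = 0xD0D0    # hypothetical value, not a real XLOG_PAGE_MAGIC


def make_segment(valid_pages: int) -> bytes:
    """Build a toy segment whose first `valid_pages` pages carry the magic."""
    seg = bytearray(PAGE_SIZE * SEGMENT_PAGES)  # starts out zero-filled
    for page in range(valid_pages):
        struct.pack_into("<H", seg, page * PAGE_SIZE, TOY_MAGIC)
    return bytes(seg)


def init_new_timeline_segment(old_segment: bytes, copy_from_old: bool) -> bytes:
    """Create the new timeline's segment: copied, or zero-initialized."""
    if copy_from_old:
        return bytes(old_segment)
    return bytes(len(old_segment))


def check_page(segment: bytes, page: int) -> str:
    """Mimic a reader validating the header at the start of `page`."""
    (magic,) = struct.unpack_from("<H", segment, page * PAGE_SIZE)
    if magic != TOY_MAGIC:
        return f"invalid magic number {magic:04X} in page {page}"
    return "ok"


# WAL already written on the old timeline covers pages 0 and 1.
old_tli_segment = make_segment(valid_pages=2)
copied = init_new_timeline_segment(old_tli_segment, copy_from_old=True)
zeroed = init_new_timeline_segment(old_tli_segment, copy_from_old=False)

print(check_page(copied, 1))  # ok
print(check_page(zeroed, 1))  # invalid magic number 0000 in page 1
```

In this model the copied segment still validates at a page that the old timeline had already written, while the zero-initialized one fails the magic check at the same position, which is the shape of the error reported in this thread.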

---
Alena Vinter

In response to

Re: Startup PANIC on standby promotion due to zero-filled WAL segment at 2025-12-23 08:38:13 from Michael Paquier

Responses

Re: Startup PANIC on standby promotion due to zero-filled WAL segment at 2025-12-23 09:47:37 from Michael Paquier
