Re: Timeline switching with partial WAL records can break replica recovery

From: Alena Vinter <dlaaren8(at)gmail(dot)com>
To: Artem Gavrilov <artem(dot)gavrilov(at)percona(dot)com>
Cc: Nataliia <k(dot)natalissa(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Timeline switching with partial WAL records can break replica recovery
Date: 2025-12-26 07:09:08
Message-ID: CAGWv16K=SkY8Y+jCNVdf7jot13YHK6opeWQAwyB+RDzgKKP=hg@mail.gmail.com
Lists: pgsql-hackers

Hi Artem!

Thank you for the clarification about archiving. I now fully understand why
writing a missing contrecord into an already-archived timeline is unsafe.

Could this be avoided by having the standby check the WAL archive before
promotion? Specifically, if the standby detects an incomplete contrecord at
the end of its WAL stream, it would first try to fetch that contrecord from
the archive. Only if the contrecord is not found there would it proceed
with writing a missing contrecord and starting a new timeline.
What do you think?
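
To make the order of checks concrete, here is a rough sketch of the decision I have in mind. It is written in Python rather than PostgreSQL C and only models the control flow; every name in it (PromotionAction, decide_promotion_action, the fetch callback) is hypothetical and does not correspond to an existing PostgreSQL function.

```python
from enum import Enum
from typing import Callable, Optional


class PromotionAction(Enum):
    """Possible outcomes of the pre-promotion check (hypothetical names)."""
    PROMOTE_NORMALLY = "promote_normally"
    REPLAY_FROM_ARCHIVE = "replay_from_archive"
    WRITE_MISSING_CONTRECORD = "write_missing_contrecord"


def decide_promotion_action(
    ends_with_incomplete_contrecord: bool,
    fetch_contrecord_from_archive: Callable[[], Optional[bytes]],
) -> PromotionAction:
    """Model of the proposed check order on a standby about to be promoted.

    fetch_contrecord_from_archive is an assumed callback that returns the
    missing continuation data, or None if the archive does not have it.
    """
    # WAL ends on a complete record: nothing special to do.
    if not ends_with_incomplete_contrecord:
        return PromotionAction.PROMOTE_NORMALLY

    # Look in the archive first, so that a contrecord which was already
    # archived is never overwritten.
    remainder = fetch_contrecord_from_archive()
    if remainder is not None:
        return PromotionAction.REPLAY_FROM_ARCHIVE

    # Only when the archive has nothing: write the missing contrecord
    # and start a new timeline.
    return PromotionAction.WRITE_MISSING_CONTRECORD
```

The archive is consulted only when the WAL stream actually ends in an incomplete contrecord, so the common promotion path stays unchanged.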

I plan to reproduce your described scenario to test both my original patch
and this revised approach.

P.S. I'm attaching my notes just so I don’t lose them =)

---
Alena Vinter

Attachment: contrecord_from_archive.jpg (image/jpeg, 2.1 MB)
