Re: [BUG] Take a long time to reach consistent after pg_rewind

From: cca5507 <cca5507(at)qq(dot)com>
To: surya poondla <suryapoondla4(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [BUG] Take a long time to reach consistent after pg_rewind
Date: 2026-06-16 04:46:18
Message-ID: tencent_4B419BDD4E2854217E6D580A95CC0F1EEB06@qq.com
Lists: pgsql-hackers

Hi,

Thanks for the comments!

> 1. Commit message understates the fix. It only describes the page-header symptom. The crash-safety property is the stronger argument and the one that resolves the back-and-forth on which LSN to use. Suggest adding:
> a. "Using the flush LSN is also crash-safe with respect to the source: the insert LSN lives only in shared memory and can be lost on a source crash, leaving the standby's minRecoveryPoint ahead of any LSN the source can subsequently reach."
> 2. Code comment should explain why flush LSN is sufficient. The current "We must replay to the last WAL flush location" doesn't say why. Suggest:
> a. "Use the source's flush LSN as the target's minRecoveryPoint: every WAL-logged page we copied has page-LSN <= source's flush LSN at copy time (WAL-before-data), and flush LSN is monotonic. We avoid the insert LSN for two reasons: it can sit one page-header past a record's end at segment boundaries (where no record will end), and it is not durable, so a source crash can leave flush LSN behind an insert LSN we already pinned."
> 3. Worth a comment in rewind_source.h that the callback must only be invoked against a non-standby source, because pg_current_wal_flush_lsn() errors out during recovery.

Fixed.
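
To illustrate the segment-boundary point in item 2, here is a minimal Python model of how a "usable byte position" in WAL might map to an LSN. It is a sketch, not the backend code: the function names are hypothetical, and the block size (8192), segment size (16 MB), and page-header sizes (24 and 40 bytes) are assumed defaults. The model shows that when a record ends exactly on a page or segment boundary, the insert position already points past the next page header, while the record's end LSN stays on the boundary.

```python
# Simplified model of WAL position arithmetic. All constants are assumed
# defaults, and the function names are hypothetical (not PostgreSQL APIs).
XLOG_BLCKSZ = 8192                    # WAL page size (assumed)
WAL_SEGMENT_SIZE = 16 * 1024 * 1024   # WAL segment size (assumed)
SHORT_PHD = 24                        # short page header size (assumed)
LONG_PHD = 40                         # long header on a segment's first page (assumed)

USABLE_PER_PAGE = XLOG_BLCKSZ - SHORT_PHD
USABLE_PER_SEGMENT = (
    (WAL_SEGMENT_SIZE // XLOG_BLCKSZ) * USABLE_PER_PAGE - (LONG_PHD - SHORT_PHD)
)


def _to_lsn(bytepos, is_end):
    """Map a usable-byte position (page headers excluded) to an LSN.

    With is_end=True, a position that falls exactly on a page boundary
    maps to the boundary itself; otherwise it maps past the page header.
    """
    fullsegs, bytesleft = divmod(bytepos, USABLE_PER_SEGMENT)
    first_page_usable = XLOG_BLCKSZ - LONG_PHD
    if bytesleft < first_page_usable:
        # Position lies on the first page of the segment (long header).
        if is_end and bytesleft == 0:
            seg_offset = 0
        else:
            seg_offset = bytesleft + LONG_PHD
    else:
        # Skip the first page, then count whole pages with short headers.
        bytesleft -= first_page_usable
        fullpages, bytesleft = divmod(bytesleft, USABLE_PER_PAGE)
        seg_offset = XLOG_BLCKSZ + fullpages * XLOG_BLCKSZ
        if not (is_end and bytesleft == 0):
            seg_offset += bytesleft + SHORT_PHD
    return fullsegs * WAL_SEGMENT_SIZE + seg_offset


def insert_lsn(bytepos):
    """Where the next record would start (models the insert LSN)."""
    return _to_lsn(bytepos, is_end=False)


def record_end_lsn(bytepos):
    """Where the last record ended (the furthest point replay can reach)."""
    return _to_lsn(bytepos, is_end=True)


def format_lsn(lsn):
    """Render an LSN in the usual hi/lo hexadecimal form."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


if __name__ == "__main__":
    pos = USABLE_PER_SEGMENT  # last record filled the first segment exactly
    print(format_lsn(record_end_lsn(pos)))  # 0/1000000
    print(format_lsn(insert_lsn(pos)))      # 0/1000028
```

In this model, a minRecoveryPoint taken from the insert position (0/1000028) lies 40 bytes beyond the end of the last record (0/1000000), so replay could not reach it until the source writes another record. A position bounded by the end of written records does not have that gap.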

> 4. No regression test. We can add a regression test under src/bin/pg_rewind/t/.

I don't have a good idea for the test yet; I will work on it later. Any help is welcome!

--
Regards,
ChangAo Chen

Attachment: v3-0001-pg_rewind-use-flush-lsn-to-set-minRecoveryPoint.patch (application/octet-stream, 5.8 KB)
