Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL

From: BharatDB <bharatdbpg(at)gmail(dot)com>
To: Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL
Date: 2025-09-29 10:31:04
Message-ID: CAAh00ERqqAhgA_BJJccwE0BXxUWMk+FHzMoLo1kWcsm+qdNVjw@mail.gmail.com
Lists: pgsql-hackers

*Dear Srinath,*

*Subject:* [PATCH] pg_rewind: Ignore shutdown checkpoints when determining
rewind necessity.

While working with pg_rewind, I noticed that it can sometimes request a
rewind even when no real changes exist after a failover. This happens
because pg_rewind currently determines the end-of-WAL on the target using
the last shutdown checkpoint (or minRecoveryPoint for a standby). In a
clean failover scenario, where a standby is promoted and the old primary is
later shut down, the only WAL record generated after divergence may be a
shutdown checkpoint. Although the data on both nodes is identical, pg_rewind
treats this shutdown record as meaningful and unnecessarily forces a
rewind.

The proposed patch fixes this by ignoring shutdown checkpoints
(XLOG_CHECKPOINT_SHUTDOWN) when determining the end-of-WAL, scanning
backward until a non-shutdown record is found. This ensures that rewinds
are triggered only when actual modifications exist after divergence,
avoiding unnecessary rewinds in clean failover situations.
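
To make the backward scan concrete, here is a minimal, self-contained
sketch of the idea in C. It is a toy model only, not the code from the
attached patch: ToyRecord, toy_effective_end() and toy_rewind_needed() are
hypothetical names, records are modelled by their end positions, and the
fallback of returning 0 when only shutdown checkpoints exist is my own
assumption for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; these are NOT PostgreSQL's XLogRecPtr or record types. */
typedef uint64_t ToyLsn;

typedef enum
{
    TOY_REC_DATA,                 /* any record that modifies data */
    TOY_REC_ONLINE_CHECKPOINT,
    TOY_REC_SHUTDOWN_CHECKPOINT   /* models XLOG_CHECKPOINT_SHUTDOWN */
} ToyRecKind;

typedef struct
{
    ToyLsn      end_lsn;          /* position just past this record */
    ToyRecKind  kind;
} ToyRecord;

/*
 * Walk backward from the newest record and return the end position of the
 * newest record that is not a shutdown checkpoint. Returns 0 when no such
 * record exists (an assumption made for this sketch).
 */
static ToyLsn
toy_effective_end(const ToyRecord *recs, size_t nrecs)
{
    assert(recs != NULL || nrecs == 0);

    for (size_t i = nrecs; i-- > 0;)
    {
        if (recs[i].kind != TOY_REC_SHUTDOWN_CHECKPOINT)
            return recs[i].end_lsn;
    }
    return 0;
}

/* Same comparison as in the original report, using the filtered end. */
static bool
toy_rewind_needed(const ToyRecord *recs, size_t nrecs, ToyLsn divergerec)
{
    return toy_effective_end(recs, nrecs) > divergerec;
}
```

In this model, a target whose WAL is one data record ending at 100
followed by a shutdown checkpoint ending at 140, with divergence at 100,
no longer needs a rewind. A target with an extra data record after the
divergence point still does.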

With the proposed fix applied, running my local script gives the following
results:

- Old primary shuts down cleanly.
- Standby is promoted successfully.
- pg_rewind correctly detects no rewind is needed.
- Data on both clusters matches perfectly.

I believe this change will prevent unnecessary rewinds in production
environments, improve reliability, and avoid potential confusion during
failovers.

Thank you for your consideration.

Best regards,
Soumya.

On Sat, Sep 6, 2025 at 10:04 PM Srinath Reddy Sadipiralla <
srinath2133(at)gmail(dot)com> wrote:

> Hi all,
>
> While working with pg_rewind, I noticed that it can sometimes request a
> rewind even when no actual changes exist after a failover.
>
> *Problem:*
> Currently, pg_rewind determines the end-of-WAL on the target by using the
> last shutdown checkpoint (or minRecoveryPoint for a standby). This creates
> a false positive scenario:
>
> 1) Suppose a standby is promoted to become the new primary.
> 2) Later, the old primary is cleanly shut down.
> 3) The only WAL record generated on the old primary after divergence is a
> shutdown checkpoint.
>
> At this point, the old primary and new primary contain identical data.
> However, since the shutdown checkpoint extends the WAL past the divergence
> point, pg_rewind concludes:
>
>     if (target_wal_endrec > divergerec)
>         rewind_needed = true;
>
> That forces a rewind even though there are no meaningful changes.
>
> To *reproduce this scenario*, use the attached script.
>
> *Fix:*
> The attached patch changes the logic so that pg_rewind no longer treats
> shutdown checkpoints as meaningful records when determining the end-of-WAL.
> Instead, we scan backward from the last checkpoint until we find the most
> recent valid WAL record that is not a shutdown-only related record.
>
> This ensures rewind is only triggered when there are actual modifications
> after divergence, avoiding unnecessary rewinds in clean failover scenarios.
>
>
> --
> Thanks,
> Srinath Reddy Sadipiralla
> EDB: https://www.enterprisedb.com/
>

Attachment                                                       Content-Type               Size
run_pg_rewind_with_port.sh                                       application/x-shellscript  2.8 KB
0001-Modified-the-condition-to-ignore-shutdown-only-check.patch  text/x-patch               2.0 KB
Screenshot from 2025-09-29 15-31-33.png                          image/png                  84.2 KB
