Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL

From: BharatDB <bharatdbpg(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL
Date: 2025-10-22 12:59:25
Message-ID: CAAh00ERUVBGfqbGnyYq7nRtvnkhXEbkyx+0GKErKDt122b5orQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 1, 2025 at 6:10 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Tue, Sep 30, 2025 at 1:24 PM Srinath Reddy Sadipiralla
> <srinath2133(at)gmail(dot)com> wrote:
> > Can you please once confirm this, did you mean that this is not even an
> actual problem to fix or only this patch's logic which I provided does not
> make sense?, because i am trying out come up with another patch based on
> your inputs regarding considering controlfile changes , ignoring
> RUNNING_XACTS records, and to use XLogRecGetRmid test.
>
> Well, the patch's idea is that we can ignore certain WAL records when
> deciding whether pg_rewind is needed. But I do not think we can do
> that, because (1) those WAL records might do important things like
> update the control file and (2) the server will not be OK with
> ignoring those WAL records even if pg_rewind decides that they are not
> important. If you have a plan for working around those two issues,
> please say what your plan is. I don't personally see how it would be
> possible to work around those issues, but of course somebody else
> might have a good idea that has not occurred to me.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com

Hi all,
With reference to the previous mail, I’d like to submit a small patch for
pg_rewind to fix an issue with false-positive rewinds.
The patch includes the following logic:

- Control file changes: Detects benign shutdown checkpoint differences
in the control file and prevents unnecessary rewinds.
- Other WAL records (RUNNING_XACTS): Ensured that only meaningful WAL
differences trigger rewinds, ignoring records that do not affect server
consistency.
- Server-side consistency: Maintains data integrity while skipping
rewinds for harmless control file changes.
- RMID verification: Confirms that WAL records are examined correctly
using their Resource Manager IDs (RMID) to avoid misinterpreting benign
differences.

I added a simple check in pg_rewind to detect these benign control file
differences. When such a difference is detected, pg_rewind skips the
rewind, prints a log message and no false-positive changes are applied. This
is implemented in the function control_diff_is_benign() and is integrated
into the last checkpoint detection logic.
I tested the logic with a small test script which automatically - initializes
a primary cluster and inserts some test data, creates a standby using
pg_basebackup,
promotes the standby to primary, injects a benign control file change
using pg_resetwal,
runs pg_rewind and verifies that no rewind happens and confirms that data
remains consistent between the clusters.

After testing, I got the output as:test_pg_rewind_fix.sh)

=== 🧮 Test Setup ===
PRIMARY PORT: 50584
STANDBY PORT: 51636
BASE DIR: /home/deepshikha/pg_rewind_test

Cleaning workspace...
Initializing primary cluster...
initdb: warning: enabling "trust" authentication for local connections
initdb: hint: You can change this by editing pg_hba.conf or using the
option -A, or --auth-local and --auth-host, the next time you run initdb.
waiting for server to start.... done
server started
Old primary WAL position: 0/01BA4B80
Creating standby cluster via pg_basebackup...
31374/31374 kB (100%), 1/1 tablespace
waiting for server to start...... done
server started
waiting for server to promote.... done
server promoted
New promoted primary WAL position: 0/03000130
Stopping old primary...
waiting for server to shut down.... done
server stopped
=== WAL Summary ===
Old primary WAL: 0/01BA4B80
New primary WAL: 0/03000130

Injecting benign control file change (pg_resetwal)...
Benign difference introduced.

Running pg_rewind to test fix...
--- pg_rewind output ---
pg_rewind: servers diverged at WAL location 0/03000000 on timeline 1
pg_rewind: benign shutdown checkpoint difference detected, skipping rewind
pg_rewind: error: could not open file
"/home/deepshikha/pg_rewind_test/primary/pg_wal/000000010000000000000003":
No such file or directory
pg_rewind: benign shutdown checkpoint difference detected, skipping rewind
pg_rewind: rewinding from last common checkpoint at 0/000006F0 on timeline
2796767292
pg_rewind: error: could not open file
"/home/deepshikha/pg_rewind_test/primary/pg_wal/000000010000000000000000":
No such file or directory
pg_rewind: error: could not read WAL record at 0/000006F0
------------------------

waiting for server to start.... done
server started
=== Data check ===
Primary data:
id | data
----+---------
1 | primary
2 | wal1
3 | wal2
(3 rows)

New primary data:
id | data
----+---------
1 | primary
2 | wal1
3 | wal2
(3 rows)

PASS: Benign control file difference correctly detected. No false-positive
rewind.

=== Test completed ===

Note => PASS: Benign control file difference correctly detected. No
false-positive rewind.

This confirms that the patch works as expected, preventing unnecessary
rewinds when the only difference between the old primary and the new
primary is a benign shutdown checkpoint change.

I Kindly request you to review the patch and please let me know if any
additional details need to be focused on.

Thanking you.

Regards,

Soumya

Attachment Content-Type Size
0001-pg_rewind-Skip-false-positive-rewind-on-benign-shutd.patch text/x-patch 9.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Akshay Joshi 2025-10-22 13:19:10 Re: [PATCH] Add pg_get_policy_ddl() function to reconstruct CREATE POLICY statement
Previous Message Andres Freund 2025-10-22 12:59:02 Re: Stack-based tracking of per-node WAL/buffer usage