From: | BharatDB <bharatdbpg(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL |
Date: | 2025-10-22 12:59:25 |
Message-ID: | CAAh00ERUVBGfqbGnyYq7nRtvnkhXEbkyx+0GKErKDt122b5orQ@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Oct 1, 2025 at 6:10 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Sep 30, 2025 at 1:24 PM Srinath Reddy Sadipiralla
> <srinath2133(at)gmail(dot)com> wrote:
> > Can you please once confirm this, did you mean that this is not even an
> actual problem to fix or only this patch's logic which I provided does not
> make sense?, because i am trying out come up with another patch based on
> your inputs regarding considering controlfile changes , ignoring
> RUNNING_XACTS records, and to use XLogRecGetRmid test.
>
> Well, the patch's idea is that we can ignore certain WAL records when
> deciding whether pg_rewind is needed. But I do not think we can do
> that, because (1) those WAL records might do important things like
> update the control file and (2) the server will not be OK with
> ignoring those WAL records even if pg_rewind decides that they are not
> important. If you have a plan for working around those two issues,
> please say what your plan is. I don't personally see how it would be
> possible to work around those issues, but of course somebody else
> might have a good idea that has not occurred to me.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
Hi all,
With reference to the previous mail, I’d like to submit a small patch for
pg_rewind to fix an issue with false-positive rewinds.
The patch includes the following logic:
- Control file changes: Detects benign shutdown checkpoint differences
in the control file and prevents unnecessary rewinds.
- Other WAL records (RUNNING_XACTS): Ensured that only meaningful WAL
differences trigger rewinds, ignoring records that do not affect server
consistency.
- Server-side consistency: Maintains data integrity while skipping
rewinds for harmless control file changes.
- RMID verification: Confirms that WAL records are examined correctly
using their Resource Manager IDs (RMID) to avoid misinterpreting benign
differences.
I added a simple check in pg_rewind to detect these benign control file
differences. When such a difference is detected, pg_rewind skips the
rewind, prints a log message and no false-positive changes are applied. This
is implemented in the function control_diff_is_benign() and is integrated
into the last checkpoint detection logic.
I tested the logic with a small test script which automatically - initializes
a primary cluster and inserts some test data, creates a standby using
pg_basebackup,
promotes the standby to primary, injects a benign control file change
using pg_resetwal,
runs pg_rewind and verifies that no rewind happens and confirms that data
remains consistent between the clusters.
After testing, I got the output as:test_pg_rewind_fix.sh)
=== 🧮 Test Setup ===
PRIMARY PORT: 50584
STANDBY PORT: 51636
BASE DIR: /home/deepshikha/pg_rewind_test
Cleaning workspace...
Initializing primary cluster...
initdb: warning: enabling "trust" authentication for local connections
initdb: hint: You can change this by editing pg_hba.conf or using the
option -A, or --auth-local and --auth-host, the next time you run initdb.
waiting for server to start.... done
server started
Old primary WAL position: 0/01BA4B80
Creating standby cluster via pg_basebackup...
31374/31374 kB (100%), 1/1 tablespace
waiting for server to start...... done
server started
waiting for server to promote.... done
server promoted
New promoted primary WAL position: 0/03000130
Stopping old primary...
waiting for server to shut down.... done
server stopped
=== WAL Summary ===
Old primary WAL: 0/01BA4B80
New primary WAL: 0/03000130
Injecting benign control file change (pg_resetwal)...
Benign difference introduced.
Running pg_rewind to test fix...
--- pg_rewind output ---
pg_rewind: servers diverged at WAL location 0/03000000 on timeline 1
pg_rewind: benign shutdown checkpoint difference detected, skipping rewind
pg_rewind: error: could not open file
"/home/deepshikha/pg_rewind_test/primary/pg_wal/000000010000000000000003":
No such file or directory
pg_rewind: benign shutdown checkpoint difference detected, skipping rewind
pg_rewind: rewinding from last common checkpoint at 0/000006F0 on timeline
2796767292
pg_rewind: error: could not open file
"/home/deepshikha/pg_rewind_test/primary/pg_wal/000000010000000000000000":
No such file or directory
pg_rewind: error: could not read WAL record at 0/000006F0
------------------------
waiting for server to start.... done
server started
=== Data check ===
Primary data:
id | data
----+---------
1 | primary
2 | wal1
3 | wal2
(3 rows)
New primary data:
id | data
----+---------
1 | primary
2 | wal1
3 | wal2
(3 rows)
PASS: Benign control file difference correctly detected. No false-positive
rewind.
=== Test completed ===
Note => PASS: Benign control file difference correctly detected. No
false-positive rewind.
This confirms that the patch works as expected, preventing unnecessary
rewinds when the only difference between the old primary and the new
primary is a benign shutdown checkpoint change.
I Kindly request you to review the patch and please let me know if any
additional details need to be focused on.
Thanking you.
Regards,
Soumya
Attachment | Content-Type | Size |
---|---|---|
0001-pg_rewind-Skip-false-positive-rewind-on-benign-shutd.patch | text/x-patch | 9.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Akshay Joshi | 2025-10-22 13:19:10 | Re: [PATCH] Add pg_get_policy_ddl() function to reconstruct CREATE POLICY statement |
Previous Message | Andres Freund | 2025-10-22 12:59:02 | Re: Stack-based tracking of per-node WAL/buffer usage |