Re: [PATCH] Fix PITR pause bypass when initial XLOG_RUNNING_XACTS has subxid overflow

From: Jan Nidzwetzki <jan(at)planetscale(dot)com>
To: Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>
Cc: Matt Blewitt <mble(at)planetscale(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] Fix PITR pause bypass when initial XLOG_RUNNING_XACTS has subxid overflow
Date: 2026-06-16 08:58:52
Message-ID: BED7F79C-2D2A-49E6-909A-83A4ECEB8C9F@planetscale.com
Lists: pgsql-hackers

Hello Zsolt,

Thank you very much for pointing out the problem and for providing the TAP test that reproduces it. I missed that PostgreSQL can change data in recovery mode when the database is not using checksums and the server is running without 'wal_log_hints'. Rather than trying to make that path safe, I think the conservative fix is to log a message and shut down when an incomplete snapshot is present at the end of recovery and 'recovery_target_action' is set to 'pause'.

The attached patch does that: when hot standby is not active at the recovery target (e.g., due to an incomplete snapshot), PostgreSQL will log a message and shut down instead of silently promoting. This mirrors how 'pause' is already downgraded to 'shutdown' when hot_standby is off. Shutting down lets the user choose a different recovery target or action. The patch also updates the documentation to clarify the behavior and adds a TAP test to verify the change.
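
As a rough illustration of the decision described above, here is a simplified, standalone sketch. It is not the code from the attached patch: the enum, the function name, and both boolean parameters are hypothetical stand-ins for the real recovery state.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not the patch's identifiers. */
typedef enum
{
    TARGET_ACTION_PAUSE,
    TARGET_ACTION_PROMOTE,
    TARGET_ACTION_SHUTDOWN
} TargetAction;

/*
 * Decide what to do once the recovery target has been reached.
 *
 * hot_standby_enabled: the hot_standby setting is on.
 * hot_standby_active:  a consistent snapshot exists, so connections are
 *                      allowed and a pause can actually be observed and
 *                      resumed by the user.
 */
TargetAction
effective_target_action(TargetAction configured,
                        bool hot_standby_enabled,
                        bool hot_standby_active)
{
    if (configured != TARGET_ACTION_PAUSE)
        return configured;      /* promote and shutdown are unaffected */

    /* Existing behavior: pause is downgraded when hot_standby is off. */
    if (!hot_standby_enabled)
        return TARGET_ACTION_SHUTDOWN;

    /*
     * Proposed behavior: hot standby is enabled but never became active
     * (e.g., incomplete snapshot), so shut down rather than fall through
     * to promotion.
     */
    if (!hot_standby_active)
        return TARGET_ACTION_SHUTDOWN;

    return TARGET_ACTION_PAUSE;
}
```

With this shape, 'pause' is only honored when a user can actually connect and resume recovery; in every other case the server stops instead of promoting.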

Best regards
Jan

Attachment Content-Type Size
0001-Shut-down-instead-of-promoting-when-recovery-cannot-.patch application/octet-stream 9.2 KB
