| From: | Nikolay Samokhvalov <nik(at)postgres(dot)ai> |
|---|---|
| To: | pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | recovery.signal not cleaned up when both signal files are present |
| Date: | 2026-02-06 20:41:32 |
| Message-ID: | CAM527d8PVAQFLt_ndTXE19F-XpDZui861882L0rLY3YihQB8qA@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi hackers,
I observed a case where users ran "pgbackrest restore" without
"--type=standby", so pgBackRest placed recovery.signal. Since they
wanted this node to be a standby, they then manually added
standby.signal too and configured primary_conninfo.
Postgres allows both recovery.signal and standby.signal to coexist:
it starts without complaint and gives standby.signal precedence.
However, this can lead to a latent problem: imagine the standby gets
promoted and then goes through a subsequent failover cycle. In that
case, the orphaned recovery.signal causes the node to perform an
unexpected PITR recovery and self-promote to a new timeline instead of
remaining a standby, which surprised the user a lot.
The exact sequence that leads to trouble (reproduced on PostgreSQL 17.7
with pgBackRest 2.58.0):
1. Restore a backup (pgBackRest default creates `recovery.signal`)
2. Add `standby.signal` and `primary_conninfo` for streaming replication
3. Start as standby — works fine (`standby.signal` takes precedence)
4. Promote this standby to primary (e.g., switchover) —
`standby.signal` is removed, `recovery.signal` is NOT
5. Node runs as primary with `recovery.signal` still on disk
6. Node crashes or is stopped
7. pg_rewind + add `standby.signal` to rejoin as standby
8. Start — works as standby again, `recovery.signal` still present
9. Promote again (e.g., failback) — `standby.signal` removed,
`recovery.signal` still NOT removed
10. If the node later needs to rejoin as standby via pg_rewind
(without `standby.signal` yet), it finds `recovery.signal`,
performs PITR recovery, and self-promotes to a new timeline
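To spot the leftover file from steps 5 and 9 before it bites in step 10, tooling could inspect the data directory directly. The following is only a minimal sketch: the enum and function names are hypothetical, and the only assumption about Postgres is that the two signal files sit at the top level of the data directory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical classification of what is present in a data directory. */
typedef enum SignalFileState
{
    SIGNAL_NONE,
    SIGNAL_STANDBY_ONLY,
    SIGNAL_RECOVERY_ONLY,
    SIGNAL_BOTH
} SignalFileState;

/* Returns true if `name` exists directly under `pgdata`. */
static bool
signal_file_exists(const char *pgdata, const char *name)
{
    char        path[4096];
    struct stat st;
    int         n;

    assert(pgdata != NULL && name != NULL);
    n = snprintf(path, sizeof(path), "%s/%s", pgdata, name);
    if (n < 0 || (size_t) n >= sizeof(path))
        return false;           /* path too long: treat as absent */
    return stat(path, &st) == 0;
}

/* Checks both files independently, unlike the server's startup check. */
static SignalFileState
inspect_signal_files(const char *pgdata)
{
    bool        standby = signal_file_exists(pgdata, "standby.signal");
    bool        recovery = signal_file_exists(pgdata, "recovery.signal");

    if (standby && recovery)
        return SIGNAL_BOTH;
    if (standby)
        return SIGNAL_STANDBY_ONLY;
    if (recovery)
        return SIGNAL_RECOVERY_ONLY;
    return SIGNAL_NONE;
}
```

On a node that is running as a primary, a result of SIGNAL_RECOVERY_ONLY would correspond to the state in step 5, and SIGNAL_BOTH to the state in steps 3 and 8.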
I spent some time trying to understand this and found the following in
xlogrecovery.c:
    if (stat(STANDBY_SIGNAL_FILE, &stat_buf) == 0)
    {
        /* ... */
        standby_signal_file_found = true;
    }
    else if (stat(RECOVERY_SIGNAL_FILE, &stat_buf) == 0)
    {
        /* ... */
        recovery_signal_file_found = true;
    }
-- so when standby.signal is present, recovery.signal is never checked:
recovery_signal_file_found stays false and Postgres doesn't know the
file exists.
The cleanup logic for both files in xlog.c looks independent:
    if (endOfRecoveryInfo->standby_signal_file_found)
        durable_unlink(STANDBY_SIGNAL_FILE, FATAL);
    if (endOfRecoveryInfo->recovery_signal_file_found)
        durable_unlink(RECOVERY_SIGNAL_FILE, FATAL);
-- but it cleans up only what it knows about, so recovery.signal is
never removed.
Concerns/questions:
1. I don't like the fact that recovery_signal_file_found is set to
false although the file is present -- this is hard to read and
troubleshoot...
2. The comment in xlog.c says "Remove the signal files out of the way,
so that we don't accidentally re-enter archive recovery mode in a
subsequent crash" -- but `recovery.signal` escapes this cleanup. It
looks like this behavior was not intended by design; is that the
correct conclusion?
3. It seems to me that having both files coexist is always a
misconfiguration -- there is no use case where a node should be in
both PITR and standby mode. If so, maybe we should:
- at minimum, remove the orphaned recovery.signal when
standby.signal takes precedence (or at end of recovery)
- refuse to start if both files are present: treat it as abnormal and
ask for explicit cleanup, so the user (or tooling) can decide which
file should stay
thoughts?
Nik