Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: David Steele <david(at)pgmasters(dot)net>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, zxwsbg12138(at)gmail(dot)com, david(dot)zhang(at)highgo(dot)ca, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
Date: 2023-10-31 12:28:07
Message-ID: CA+TgmoYmWhN4u4zxmdYciaHJZdqkGKzpn==sbSOeRP8PJFSvoA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 30, 2023 at 8:40 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> As far as I know, there's one paragraph in the docs that implies this
> mode without giving an actual hint that this may be OK or not, so
> shrug:
> https://www.postgresql.org/docs/devel/continuous-archiving.html#BACKUP-TIPS
> "As with base backups, the easiest way to produce a standalone hot
> backup is to use the pg_basebackup tool. If you include the -X
> parameter when calling it, all the write-ahead log required to use the
> backup will be included in the backup automatically, and no special
> action is required to restore the backup."

I see your point, but that's way too subtle. As far as I know, the
only actually-documented procedure for restoring is this one:

https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-PITR-RECOVERY

That procedure actually is badly in need of some updating, IMHO,
because close to half of it is about moving your existing database
cluster out of the way, which may or may not be needed in the case of
any particular backup restore. Also, it unconditionally mentions
creating recovery.signal, with no mention of standby.signal, and
certainly no mention of running with neither. It also gives zero
motivation for actually doing this and says nothing useful about
backup_label.

Both recovery.signal and standby.signal are documented in
https://www.postgresql.org/docs/current/runtime-config-wal.html#RUNTIME-CONFIG-WAL-ARCHIVE-RECOVERY
but you'd have no real reason to look in a list of GUCs for
information about a file on disk. recovery.signal but not
standby.signal is mentioned in
https://www.postgresql.org/docs/current/warm-standby.html but nowhere
that I can find do we explicitly talk about running with at least one
of them.

> As you're telling me, and I've considered that as an option as well,
> perhaps we should just consider the presence of a backup_label file
> with no .signal files as a synonym of crash recovery? In the recovery
> path, currently the essence of the problem is when we do
> InArchiveRecovery=true, but ArchiveRecoveryRequested=false, meaning
> that it should do archive recovery but we don't want it, and that does
> not really make sense. The rest of the code sort of implies that this
> is not a supported combination. So basically, my suggestion here, is
> to just replay WAL up to the end of what's in your local pg_wal/ and
> hope for the best, without TLI jumps, except that we'd do nothing.

This sentence seems to be incomplete.

But I was not saying we should treat the case where we have a
backup_label file like crash recovery. The real question here is why
we don't treat it fully like archive recovery. I don't know off-hand
what is different if I start the server with both backup_label and
recovery.signal vs. if I start it with only backup_label, but I
question whether there should be any difference at all.
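
For anyone following along, the flag combination being discussed can be
modelled roughly as below. This is only a sketch, not the server's code:
the names recovery_flags and RecoveryFlags are made up for illustration,
and the mapping covers just what this thread states (a .signal file sets
ArchiveRecoveryRequested; a backup_label alone still gives
InArchiveRecovery=true). The no-backup_label case is simplified and
labelled as an assumption in the comments.

```python
from typing import NamedTuple


class RecoveryFlags(NamedTuple):
    """Hypothetical container mirroring two flags named in the thread."""
    archive_recovery_requested: bool  # ArchiveRecoveryRequested
    in_archive_recovery: bool         # InArchiveRecovery
    mismatched: bool                  # in archive recovery without a request


def recovery_flags(has_backup_label: bool,
                   has_recovery_signal: bool,
                   has_standby_signal: bool) -> RecoveryFlags:
    # Per the thread: either .signal file means archive recovery was requested.
    requested = has_recovery_signal or has_standby_signal

    # Per the thread: a backup_label leads to InArchiveRecovery=true even
    # when no .signal file exists.
    # ASSUMPTION (simplification): without a backup_label, this sketch just
    # lets InArchiveRecovery follow the request; the real startup code has
    # more cases than this.
    in_archive = has_backup_label or requested

    # The combination questioned in the thread:
    # InArchiveRecovery=true but ArchiveRecoveryRequested=false.
    return RecoveryFlags(requested, in_archive, in_archive and not requested)


if __name__ == "__main__":
    for label in (False, True):
        for rec in (False, True):
            for stby in (False, True):
                print(label, rec, stby, recovery_flags(label, rec, stby))
```

In this model the only inputs that produce the mismatched state are a
backup_label with neither signal file, which is the case this thread is
about.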

> A point of contention is if we'd better be stricter about satisfying
> backupEndPoint in such a case, but the redo code only wants to do
> something here when ArchiveRecoveryRequested is set (aka there's a
> .signal file set), and we would not want a TLI jump at the end of
> recovery, so I don't see an argument with caring about backupEndPoint
> in this case.

This is a bit hard for me to understand, but I disagree strongly with
the idea that we should ever ignore a backup end point if we have one.

--
Robert Haas
EDB: http://www.enterprisedb.com
