Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: David Zhang <david(dot)zhang(at)highgo(dot)ca>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
Date: 2023-07-19 23:19:13
Message-ID: ZLhvcbKQ++NfpmX8@paquier.xyz
Lists: pgsql-hackers

On Wed, Jul 19, 2023 at 11:21:17AM -0700, David Zhang wrote:
> 1) simply start server from a base backup
>
> FATAL:  could not find recovery.signal or standby.signal when recovering
> with backup_label
>
> HINT:  If you are restoring from a backup, touch
> "/media/david/disk1/pg_backup1/recovery.signal" or
> "/media/david/disk1/pg_backup1/standby.signal" and add required recovery
> options.

Note the difference when --write-recovery-conf is specified: a
standby.signal is created and primary_conninfo is written to
postgresql.auto.conf. So, yes, that's expected by default with the
patch.

> 2) touch a recovery.signal file and then try to start the server, the
> following error was encountered:
>
> FATAL:  must specify restore_command when standby mode is not enabled

Yes, that's also something expected in the scope of the v1 posted.
The idea behind this restriction is that specifying recovery.signal is
equivalent to asking for archive recovery, while not specifying
restore_command is equivalent to not providing any option to be able
to recover. See validateRecoveryParameters() and note that this
restriction has existed since the beginning of time, introduced in
commit 66ec2db. I tend to agree that there is something to be said
about self-contained backups taken with pg_basebackup, though, as
these would fail if no restore_command is specified, and this
restriction was put in place before Postgres introduced replication
and easier ways to take base backups. As a whole, I think that there
is a good argument in favor of removing this restriction for the case
where archive recovery is requested and users have all their WAL in
pg_wal/ to be able to recover up to a consistent point, while keeping
these GUC restrictions when requesting a standby (not recovery.signal,
only standby.signal).

> 3) touch a standby.signal file, then the server successfully started,
> however, it operates in standby mode, whereas the intended behavior was for
> it to function as a primary server.

standby.signal implies that the server will start in standby mode. If
one wants to deploy a new primary, which implies a timeline jump at
the end of recovery, one needs to specify recovery.signal instead.
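
To summarize the three cases above, here is a rough model of the
startup decision as described in this thread with the patch applied.
This is only a sketch and not the actual code in xlogrecovery.c: the
enum, the function names and the single restore_command argument are
made up for illustration, and it ignores details such as
primary_conninfo and the WARNING emitted for a standby that has
neither primary_conninfo nor restore_command.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names: a simplified model, not PostgreSQL source code. */
typedef enum StartupOutcome
{
    NORMAL_STARTUP,                /* no backup_label, no signal file */
    FATAL_MISSING_SIGNAL_FILE,     /* case 1: backup_label without a signal file */
    FATAL_MISSING_RESTORE_COMMAND, /* case 2: recovery.signal, no restore_command */
    ARCHIVE_RECOVERY,              /* recover, then promote on a new timeline */
    STANDBY_MODE                   /* case 3: standby.signal */
} StartupOutcome;

/* A GUC counts as "set" when it is a non-empty string. */
static bool
guc_is_set(const char *value)
{
    return value != NULL && value[0] != '\0';
}

static StartupOutcome
model_startup(bool has_backup_label,
              bool has_recovery_signal,
              bool has_standby_signal,
              const char *restore_command)
{
    /* standby.signal takes precedence when both signal files exist. */
    if (has_standby_signal)
        return STANDBY_MODE;

    /* Archive recovery without standby mode requires restore_command. */
    if (has_recovery_signal)
        return guc_is_set(restore_command)
            ? ARCHIVE_RECOVERY
            : FATAL_MISSING_RESTORE_COMMAND;

    /* With the patch, a backup_label alone is no longer enough. */
    if (has_backup_label)
        return FATAL_MISSING_SIGNAL_FILE;

    return NORMAL_STARTUP;
}
```

In this model, dropping the restore_command requirement for
recovery.signal, as suggested above, would amount to returning
ARCHIVE_RECOVERY from the second branch unconditionally.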

We need more discussion and more opinions, but the thread has been
stalled for a few months now. Just in case, I am adding Thomas Munro
in CC, who mentioned to me at PGCon that he was interested in this
patch (this thread's problem is not directly related to the fact that
the checkpointer now runs in crash recovery, though).

For now, I am attaching a rebased v2.
--
Michael

Attachment Content-Type Size
v2-0001-Strengthen-use-of-ArchiveRecoveryRequested-and-In.patch text/x-diff 7.8 KB
