Add recovery to pg_control and remove backup_label

From: David Steele <david(at)pgmasters(dot)net>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Add recovery to pg_control and remove backup_label
Date: 2023-10-26 21:02:20
Lists: pgsql-hackers


This was originally proposed in [1] but that thread went through a
number of different proposals so it seems better to start anew.

The basic idea here is to simplify and harden recovery by getting rid of
backup_label and storing recovery information directly in pg_control.
Instead of backup software copying pg_control from PGDATA, it stores an
updated version that is returned from pg_backup_stop(). I believe this
is better for the following reasons:

* The user can no longer remove backup_label and get what looks like a
successful restore (while almost certainly causing corruption). If
pg_control is removed the cluster will not start. The user may try
pg_resetwal, but I think that tool makes it pretty clear that corruption
will result from its use. We could also modify pg_resetwal to complain
if recovery info is present in pg_control.

* We don't need to worry about backup software seeing a torn copy of
pg_control, since Postgres can safely read it out of memory and provide
a valid copy via pg_backup_stop(). This solves [2] without needing to
write pg_control via a temp file, which may affect performance on a
standby. Unfortunately, this solution cannot be back patched.

* For backup from standby, we no longer need to instruct the backup
software to copy pg_control last. In fact the backup software should not
copy pg_control from PGDATA at all.
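
To illustrate the last point, here is a rough sketch (not part of the patch) of how backup software might handle pg_control under this proposal: skip global/pg_control while copying PGDATA, then store the copy handed back by pg_backup_stop(). The function names are hypothetical, and a real tool would also skip other files (postmaster.pid, the contents of pg_wal, and so on) that this sketch ignores.

```python
import shutil
from pathlib import Path

# pg_control lives at global/pg_control inside the data directory.
CONTROL_FILE = Path("global") / "pg_control"


def copy_without_pg_control(pgdata: Path, dest: Path) -> None:
    """Copy everything under pgdata to dest except global/pg_control.

    Simplified on purpose: a real backup tool has more exclusions.
    """
    for src in pgdata.rglob("*"):
        rel = src.relative_to(pgdata)
        if rel == CONTROL_FILE:
            # Never take pg_control from PGDATA; it could be torn.
            continue
        target = dest / rel
        if src.is_dir():
            target.mkdir(parents=True, exist_ok=True)
        else:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, target)


def store_returned_pg_control(dest: Path, control_bytes: bytes) -> None:
    """Write the pg_control bytes obtained from pg_backup_stop() into the backup.

    How the caller gets control_bytes is an assumption here: it is whatever
    the proposed pg_backup_stop() returns, passed through unchanged.
    """
    target = dest / CONTROL_FILE
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(control_bytes)
```

The order of the two calls does not matter, which is the point: there is no longer a "copy pg_control last" rule to get wrong.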

Since backup_label is now gone, the fields that used to be in
backup_label are now provided as columns returned from pg_backup_start()
and pg_backup_stop() and the backup history file is still written to the
archive. For pg_basebackup we would have the option of writing the
fields into the JSON manifest, storing them to a file (e.g., ...), or
just ignoring them. None of the fields are required for recovery but
backup software may be very interested in them.

I updated pg_rewind but I'm not very confident in the tests. When I
removed backup_label processing, but before I updated pg_rewind to write
recovery info into pg_control, all the rewind tests passed.

This patch highlights the fact that we still have no tests for the
low-level backup method. I modified pgBackRest to work with this patch
and the entire test suite ran without any issues, but in-core tests
would be good to have. I'm planning to work on those myself as a
separate patch.

This patch would also make the proposal in [3] obsolete since there is
no need to rename backup_label if it is gone.

I know that outputting pg_control as bytea is going to be a bit
controversial. Software that is using psql to run pg_backup_stop()
could use encode() to get pg_control as text and then decode it later.
Alternately, we could update ReadControlFile() to recognize a
base64-encoded pg_control file. I'm not sure dealing with binary data is
that much of a problem, though, and if the backup software gets it wrong
then recovery will fail on an invalid pg_control file.
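
For the encode()/decode route, a minimal sketch of the client side might look like the following. It assumes the tool asked for encode(..., 'base64') in SQL and captured the result as text; the helper names are hypothetical and nothing here is part of the patch. PostgreSQL's base64 output is broken into lines of 76 characters, and Python's default b64decode() discards those newlines, so no extra cleanup is needed.

```python
import base64
from pathlib import Path


def decode_pg_control(encoded_text: str) -> bytes:
    """Convert base64 text captured from psql back into the raw bytes.

    With the default validate=False, b64decode() ignores characters outside
    the base64 alphabet, including the line breaks PostgreSQL inserts.
    """
    return base64.b64decode(encoded_text)


def write_pg_control(path: Path, encoded_text: str) -> None:
    """Decode the text and write the binary pg_control file to path."""
    path.write_bytes(decode_pg_control(encoded_text))
```

If the decode step is skipped or mangles the data, the result is simply an invalid pg_control file.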

Lastly, I think there are improvements to be made in recovery that go
beyond this patch. I originally set out to load the recovery info into
*just* the existing fields in pg_control but it required so many changes
to recovery that I decided it was too dangerous to do all in one patch.
This patch very much takes the "backup_label in pg_control" approach,
though I reused fields where possible. The added fields, e.g.
backupRecoveryRequested, also allow us to keep the user experience
pretty much the same in terms of messages and errors.




Attachment Content-Type Size
v01-recovery-in-pgcontrol-remove-backuplabel.patch text/plain 52.7 KB

