Re: Add recovery to pg_control and remove backup_label

From: David Steele <david(at)pgmasters(dot)net>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add recovery to pg_control and remove backup_label
Date: 2023-11-06 21:39:02
Message-ID: 46c300b9-96a1-449c-8ecd-9003bd02cb6b@pgmasters.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/6/23 02:35, Michael Paquier wrote:
> On Sun, Nov 05, 2023 at 01:45:39PM -0400, David Steele wrote:
>> Rebased on 151ffcf6.
>
> I like this patch a lot. Even if the backup_label file is removed, we
> still have all the debug information from the backup history file,
> thanks to its LABEL, BACKUP METHOD and BACKUP FROM, so no information
> is lost. It does a 1:1 replacement of the contents parsed from the
> backup_label needed by recovery by fetching them from the control
> file. Sounds like a straight-forward change to me.

That's the plan, at least!

> The patch is failing the recovery test 039_end_of_wal.pl. Could you
> look at the failure?

I'm not seeing this failure, and CI seems happy [1]. Can you give
details of the error message?

> /* Build and save the contents of the backup history file */
> - history_file = build_backup_content(state, true);
> + history_file = build_backup_content(state);
>
> build_backup_content() sounds like an incorrect name if it is a
> routine onlyused to build the contents of backup history files.

Good point, I have renamed this to build_backup_history_content().

> Why is there nothing updated in src/bin/pg_controldata/?

Oops, added.

> + /* Clear fields used to initialize recovery */
> + ControlFile->backupCheckPoint = InvalidXLogRecPtr;
> + ControlFile->backupStartPointTLI = 0;
> + ControlFile->backupRecoveryRequired = false;
> + ControlFile->backupFromStandby = false;
>
> These variables in the control file are cleaned up when the
> backup_label file was read previously, but backup_label is renamed to
> backup_label.old a bit later than that. Your logic looks correct seen
> from here, but shouldn't these variables be set much later, aka just
> *after* UpdateControlFile(). This gap between the initialization of
> the control file and the in-memory reset makes the code quite brittle,
> IMO.

If we set these fields where backup_label was renamed, the logic would
not be exactly the same since pg_control won't be updated until the next
time through the loop. Since the fields should be updated before
UpdateControlFile() I thought it made sense to keep all the updates
together.

Overall I think it is simpler, and we don't need to acquire a lock on
ControlFile.

> - basebackup_progress_wait_wal_archive(&state);
> - do_pg_backup_stop(backup_state, !opt->nowait);
>
> Why is that moved?

do_pg_backup_stop() generates the updated pg_control so it needs to run
before we transmit pg_control.

> - The backup label
> - file includes the label string you gave to <function>pg_backup_start</function>,
> - as well as the time at which <function>pg_backup_start</function> was run, and
> - the name of the starting WAL file. In case of confusion it is therefore
> - possible to look inside a backup file and determine exactly which
> - backup session the dump file came from. The tablespace map file includes
> + The tablespace map file includes
>
> It may be worth mentioning that the backup history file holds this
> information on the primary's pg_wal, as well.

OK, reworded.

> The changes in sendFileWithContent() may be worth a patch of its own.

Thomas included this change in his pg_basebackup changes so I did the
same. Maybe wait a bit before we split this out? Seems like a pretty
small change...

> --- a/src/include/catalog/pg_control.h
> +++ b/src/include/catalog/pg_control.h
> @@ -146,6 +146,9 @@ typedef struct ControlFileData
> @@ -160,14 +163,25 @@ typedef struct ControlFileData
> XLogRecPtr minRecoveryPoint;
> TimeLineID minRecoveryPointTLI;
> + XLogRecPtr backupCheckPoint;
> XLogRecPtr backupStartPoint;
> + TimeLineID backupStartPointTLI;
> XLogRecPtr backupEndPoint;
> + bool backupRecoveryRequired;
> + bool backupFromStandby;
>
> This increases the size of the control file from 296B to 312B with an
> 8-byte alignment, as far as I can see. The size of the control file
> has been always a sensitive subject especially with the hard limit of
> PG_CONTROL_MAX_SAFE_SIZE. Well, the point of this patch is that this
> is the price to pay to prevent users from doing something stupid with
> a removal of a backup_label when they should not. Do others have an
> opinion about this increase in size?
>
> Actually, grouping backupStartPointTLI and minRecoveryPointTLI should
> reduce more the size with some alignment magic, no?

I thought about this, but it seemed to me that existing fields had been
positioned to make the grouping logical rather than to optimize
alignment, e.g. minRecoveryPointTLI. Ideally that would have been placed
near backupEndRequired (or vice versa). But if the general opinion is to
rearrange for alignment, I'm OK with that.

> backupRecoveryRequired in the control file is switched to false for
> pg_rewind and true for streamed backups. My gut feeling is telling me
> that this should be OK, as out-of-core tools would need an upgrade if
> they relied on the backend_label file anyway. I can see that this
> change makes use lose some documentation, unfortunately. Shouldn't
> these removed lines be moved to pg_control.h instead for the
> description of backupEndRequired?

Updated description in pg_control.h -- it's a bit vague but not sure it
is a good idea to get into the inner workings of pg_rewind here?

> doc/src/sgml/ref/pg_rewind.sgml and
> src/backend/access/transam/xlogrecovery.c still include references to
> the backup_label file.

Fixed.

Attached is a new patch based on 18b585155.

Regards,
-David

[1] https://cirrus-ci.com/build/4939808120766464

Attachment Content-Type Size
v03-recovery-in-pgcontrol-remove-backuplabel.patch text/plain 56.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-11-06 22:03:08 Re: Explicitly skip TAP tests under Meson if disabled
Previous Message Matthias van de Meent 2023-11-06 21:28:01 Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan