Re: Add recovery to pg_control and remove backup_label

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: David Steele <david(at)pgmasters(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add recovery to pg_control and remove backup_label
Date: 2024-01-26 12:57:30
Message-ID: CALDaNm0_yyfrpHXDZe-rwXnzBjiFUNsab_-YA=k2TTC-NT0bCA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 20 Nov 2023 at 06:46, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> (I am not exactly sure how, but we've lost pgsql-hackers on the way
> when you sent v5. Now added back in CC with the two latest patches
> you've proposed attached.)
>
> Here is a short summary of what has been missed by the lists:
> - I've commented that the patch should not create, not show up in
> fields returned the SQL functions or stream control files with a size
> of 512B, just stick to 8kB. If this is worth changing this should be
> applied consistently across the board including initdb, discussed on
> its own thread.
> - The backup-related fields in the control file are reset at the end
> of recovery. I've suggested to not do that to keep a trace of what
> was happening during recovery. The latest version of the patch resets
> the fields.
> - With the backup_label file gone, we lose some information in the
> backups themselves, which is not good. Instead, you have suggested an
> approach where this data is added to the backup manifest, meaning that
> no information would be lost, particularly useful for self-contained
> backups. The fields planned to be added to the backup manifest are:
> -- The start and end time of the backup, the end timestamp being
> useful to know when stop time can be used for PITR.
> -- The backup label.
> I've agreed that it may be the best thing to do at this end to not
> lose any data related to the removal of the backup_label file.
>
> On Sun, Nov 19, 2023 at 02:14:32PM -0400, David Steele wrote:
> > On 11/15/23 20:03, Michael Paquier wrote:
> >> As the label is only an informational field, the parsing added to
> >> pg_verifybackup is not really needed because it is used nowhere in the
> >> validation process, so keeping the logic simpler would be the way to
> >> go IMO. This is contrary to the WAL range for example, where start
> >> and end LSNs are used for validation with a pg_waldump command.
> >> Robert, any comments about the addition of the label in the manifest?
> >
> > I'm sure Robert will comment on this when he gets the time, but for now I
> > have backed off on passing the new info to pg_verifybackup and added
> > start/stop time.
>
> FWIW, I'm OK with the bits for the backup manifest as presented. So
> if there are no remarks and/or no objections, I'd like to apply it but
> let give some room to others to comment on that as there's been a gap
> in the emails exchanged on pgsql-hackers. I hope that the summary
> I've posted above covers everything. So let's see about doing
> something around the middle of next week. With Thanksgiving in the
> US, a lot of folks will not have the time to monitor what's happening
> on this thread.
>
> + The end time for the backup. This is when the backup was stopped in
> + <productname>PostgreSQL</productname> and represents the earliest time
> + that can be used for time-based Point-In-Time Recovery.
>
> This one is actually a very good point. We'd lost this capacity with
> the backup_label file gone without the end timestamps in the control
> file.
>
> > New patches attached based on b218fbb7.
>
> I've noticed on the other thread the remark about being less
> aggressive with the fields related to recovery in the control file, so
> I assume that this patch should leave the fields be after the end of
> recovery from the start and only rely on backupRecoveryRequired to
> decide if the recovery should use the fields or not:
> https://www.postgresql.org/message-id/241ccde1-1928-4ba2-a0bb-5350f7b191a8@=pgmasters.net
>
> + ControlFile->backupCheckPoint = InvalidXLogRecPtr;
> ControlFile->backupStartPoint = InvalidXLogRecPtr;
> + ControlFile->backupStartPointTLI = 0;
> ControlFile->backupEndPoint = InvalidXLogRecPtr;
> + ControlFile->backupFromStandby = false;
> ControlFile->backupEndRequired = false;
>
> Still, I get the temptation of being consistent with the current style
> on HEAD to reset everything, as well..

CFBot shows that the patch does not apply anymore as in [1]:

=== Applying patches on top of PostgreSQL commit ID
7014c9a4bba2d1b67d60687afb5b2091c1d07f73 ===
=== applying patch ./recovery-in-pgcontrol-v7-0001-add-info-to-manifest.patch
patching file doc/src/sgml/backup-manifest.sgml
patching file src/backend/backup/backup_manifest.c
patching file src/backend/backup/basebackup.c
Hunk #1 succeeded at 238 (offset 13 lines).
Hunk #2 succeeded at 258 (offset 13 lines).
Hunk #3 succeeded at 399 (offset 17 lines).
Hunk #4 succeeded at 652 (offset 17 lines).
can't find file to patch at input line 219
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/src/bin/pg_verifybackup/parse_manifest.c
b/src/bin/pg_verifybackup/parse_manifest.c
|index bf0227c668..408af88e58 100644
|--- a/src/bin/pg_verifybackup/parse_manifest.c
|+++ b/src/bin/pg_verifybackup/parse_manifest.c
--------------------------
No file to patch. Skipping patch.
9 out of 9 hunks ignored
patching file src/include/backup/backup_manifest.h

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_3511.log

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2024-01-26 13:01:12 Re: [PATCH] Add sortsupport for range types and btree_gist
Previous Message vignesh C 2024-01-26 12:54:01 Re: Exposing the lock manager's WaitForLockers() to SQL