Re: backup manifests

From: David Steele <david(at)pgmasters(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: backup manifests
Date: 2020-03-27 20:16:11
Message-ID: 557f7af7-af3e-5fb8-1c7a-9fdd5c488ebc@pgmasters.net
Lists: pgsql-hackers

On 3/27/20 3:29 PM, Robert Haas wrote:
> On Fri, Mar 27, 2020 at 11:26 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>>> Seems better to (later?) add support for generating manifests for WAL
>>> files, and then have a tool that can verify all the manifests required
>>> to restore a base backup.
>>
>> I'm not trying to expand on the feature set here or move the goalposts
>> way down the road, which is what seems to be what's being suggested
>> here. To be clear, I don't have any objection to adding a generic tool
>> for validating WAL as you're talking about here, but I also don't think
>> that's required for pg_validatebackup. What I do think we need is a
>> check of the WAL that's fetched when people use pg_basebackup -Xstream
>> or -Xfetch. pg_basebackup itself has that check because it's critical
>> to the backup being successful and valid. Not having that basic
>> validation of a backup really just isn't ok- there's a reason
>> pg_basebackup has that check.
>
> I don't understand how this could be done without significantly
> complicating the architecture. As I said before, -Xstream sends WAL
> over a separate connection that is unrelated to the one running
> BASE_BACKUP, so the base-backup connection doesn't know what to
> include in the manifest. Now you could do something like: once all of
> the WAL files have been fetched, the client checksums all of those and
> sends their names and checksums to the server, which turns around and
> puts them into the manifest, which it then sends back to the client.
> But that is actually quite a bit of additional complexity, and it's
> pretty strange, too, because now you have the client checksumming some
> files and the server checksumming others. I know you mentioned a few
> different ideas before, but I think they all kinda have some problem
> along these lines.
>
> I also kinda disagree with the idea that the WAL should be considered
> an integral part of the backup. I don't know how pgbackrest does
> things,

We checksum each WAL file while it is read and transmitted to the repo
by the archive_command. Then at the end of the backup we ensure that
all the WAL required to make the backup consistent has made it to the repo.
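
To illustrate the general idea of checksumming a file while it is being read and transmitted, here is a minimal sketch. It is not pgBackRest code: the function name `copy_with_checksum`, the choice of SHA-256, and the chunk size are all assumptions made for the example.

```python
import hashlib
import io


def copy_with_checksum(src, dst, chunk_size=64 * 1024):
    """Copy src to dst chunk by chunk, hashing each chunk as it passes through.

    Hypothetical helper: src and dst are binary file-like objects. The data is
    read only once, so the checksum costs no extra pass over the file.
    Returns (hex digest, number of bytes copied).
    """
    digest = hashlib.sha256()  # assumed algorithm, chosen for illustration
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        dst.write(chunk)
        total += len(chunk)
    return digest.hexdigest(), total


if __name__ == "__main__":
    source = io.BytesIO(b"pretend this is a WAL segment" * 100)
    target = io.BytesIO()
    checksum, size = copy_with_checksum(source, target)
    print(size, checksum)
```

The point of doing it this way is that the checksum describes exactly the bytes that were sent, rather than whatever is on disk when a second read happens later.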

> but BART stores each backup in a separate directory without any
> associated WAL, and then keeps all the WAL together in a different
> directory. I imagine that people who are using continuous archiving
> also tend to use -Xnone, or if they do backups by copying the files
> rather than using pg_backrest, they exclude pg_wal. In fact, for
> people with big, important databases, I'd assume that would be the
> normal pattern. You presumably wouldn't want to keep one copy of the
> WAL files taken during the backup with the backup itself, and a
> separate copy in the archive.

pgBackRest does provide the option to copy WAL into the backup directory
for the super-paranoid, though it is not the default. It is pretty handy
for moving individual backups to some other medium like tape, though.

If -Xnone is specified then it seems like pg_validatebackup is
completely off the hook. But in the case of -Xstream or -Xfetch
couldn't we at least verify that the expected WAL segments are present
and the correct size?

Storing the start/stop lsn in the manifest would be a nice thing to have
anyway and that would make this feature pretty trivial. Yeah, that's in
the backup_label file as well but the manifest is so much easier to read.
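
A rough sketch of the presence-and-size check, assuming the start/stop LSN and timeline are available to the verifier. The function names are hypothetical, and the sketch assumes the default 16MB segment size, a single timeline for the whole backup, and that a stop LSN falling exactly on a segment boundary does not require the following segment.

```python
import os

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # PostgreSQL default; assumed here


def parse_lsn(text):
    """Convert an LSN written as 'XXXXXXXX/YYYYYYYY' (hex) to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def expected_wal_segments(timeline, start_lsn, stop_lsn,
                          seg_size=WAL_SEGMENT_SIZE):
    """List the WAL segment file names covering start_lsn up to stop_lsn."""
    segs_per_xlogid = 0x100000000 // seg_size
    first = parse_lsn(start_lsn) // seg_size
    # Assumption: use the segment holding the last byte before stop_lsn.
    last = (parse_lsn(stop_lsn) - 1) // seg_size
    return [
        "%08X%08X%08X" % (timeline, seg // segs_per_xlogid,
                          seg % segs_per_xlogid)
        for seg in range(first, last + 1)
    ]


def check_wal(wal_dir, timeline, start_lsn, stop_lsn,
              seg_size=WAL_SEGMENT_SIZE):
    """Return (file name, problem) pairs for missing or wrongly sized segments."""
    problems = []
    for name in expected_wal_segments(timeline, start_lsn, stop_lsn, seg_size):
        path = os.path.join(wal_dir, name)
        if not os.path.isfile(path):
            problems.append((name, "missing"))
        elif os.path.getsize(path) != seg_size:
            problems.append((name, "wrong size"))
    return problems


if __name__ == "__main__":
    # Example LSNs are made up for illustration.
    for name in expected_wal_segments(1, "0/2000028", "0/3000100"):
        print(name)
```

This only confirms that the segments exist and have the expected length; it says nothing about their contents.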

Regards,
--
-David
david(at)pgmasters(dot)net
