Re: backup manifests

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: David Steele <david(at)pgmasters(dot)net>, Noah Misch <noah(at)leadboat(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: backup manifests
Date: 2020-03-30 21:08:01
Message-ID: 20200330210801.pkcqcslkw7sck6sf@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2020-03-30 15:23:08 -0400, Robert Haas wrote:
> On Mon, Mar 30, 2020 at 2:59 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > I wonder if it'd not be best, independent of whether we build in this
> > verification, to include that metadata in the manifest file. That's for
> > sure better than having to build a separate tool to parse timeline
> > history files.
>
> I don't think that's better, or at least not "for sure better". The
> backup_label is going to include the START TIMELINE, and if -Xfetch is
> used, we're also going to have all the timeline history files. If the
> backup manifest includes those same pieces of information, then we've
> got two sources of truth: one copy in the files the server's actually
> going to read, and another copy in the backup_manifest which we're
> going to potentially use for validation but ignore at runtime. That
> seems not great.

The data in the backup label isn't sufficient though. Without having
parsed the timeline file there's no way to verify that the correct WAL
is present. I guess we can also add client side tools to parse
timelines, add a command to fetch all of the required files, and then
interpret that somehow.

But that seems much more complicated.
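
To illustrate what such a client-side tool would involve, here is a
minimal Python sketch. It is not part of any proposed patch. It assumes
the usual timeline history file layout (one "parent TLI, switchpoint
LSN, reason" entry per line, "#" for comments), the default 16MB WAL
segment size, and that the backup's start and end LSNs are already
known. All function names are hypothetical, and real verification would
also have to deal with partial segments around a timeline switch.

```python
from typing import List, Tuple

# Assumption: default segment size; it can be changed at initdb time.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'X/Y' (two hex numbers) to an integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def parse_timeline_history(content: str) -> List[Tuple[int, int]]:
    """Return (timeline, switchpoint_lsn) pairs in file order."""
    entries = []
    for line in content.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 2)  # TLI, LSN, free-form reason
        entries.append((int(fields[0]), parse_lsn(fields[1])))
    return entries


def wal_ranges(history, end_tli, start_lsn, end_lsn):
    """Split [start_lsn, end_lsn) into (timeline, start, end) ranges."""
    ranges = []
    pos = start_lsn
    for tli, switch in history:
        if switch <= pos:
            continue  # this timeline ended before the backup started
        stop = min(switch, end_lsn)
        ranges.append((tli, pos, stop))
        pos = stop
    if pos < end_lsn:
        ranges.append((end_tli, pos, end_lsn))
    return ranges


def segment_name(tli: int, segno: int) -> str:
    """Build the 24-hex-digit WAL file name for a segment number."""
    per_id = 0x100000000 // WAL_SEGMENT_SIZE
    return "%08X%08X%08X" % (tli, segno // per_id, segno % per_id)


def required_segments(ranges) -> List[str]:
    """List the WAL segment file names covering the given ranges."""
    names = []
    for tli, start, end in ranges:
        first = start // WAL_SEGMENT_SIZE
        last = (end - 1) // WAL_SEGMENT_SIZE
        for segno in range(first, last + 1):
            names.append(segment_name(tli, segno))
    return names
```

Even this simplified version needs the history file, the start and end
LSNs, and the segment size from somewhere, none of which the
backup_label alone provides.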

Imo it makes sense to want to be able to verify that WAL looks correct
even when transporting WAL using another method (say archiving) and thus
using pg_basebackup's -Xnone.

Having the manifest actually list what's required for the base backup
doesn't seem redundant to me. Imo it makes the manifest file make a good
bit more sense, since it then actually describes the whole base
backup.

Taking the redundancy argument a bit further you can argue that we
don't need a list of relation files at all, since they're in the catalog
:P. Obviously going to that extreme doesn't make all that much
sense... But I do think it's a second source of truth that's independent
of what the backends actually are going to read.

Greetings,

Andres Freund
