Re: backup manifests

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: David Steele <david(at)pgmasters(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: backup manifests
Date: 2020-03-30 00:42:35
Message-ID: CA+TgmobSzurSGzy4uLXq__h3pKFnb4RFHepa6yxs3h4d3gwGEg@mail.gmail.com
Lists: pgsql-hackers

On Sat, Mar 28, 2020 at 11:40 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> Stephen Frost mentioned that a backup could pass validation even if
> pg_basebackup were killed after writing the base backup and before finishing
> the writing of pg_wal. One might avoid that by simply writing the manifest to
> a temporary name and renaming it to the final name after populating pg_wal.

Huh, that's an idea. I'll have a look at the code and see what would
be involved.
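
For concreteness, the rename approach might look roughly like the sketch below. This is not code from the patch: write_manifest_tmp() and finalize_manifest() are hypothetical names, and it assumes POSIX rename() semantics, where the final name appears atomically. A real implementation would presumably also need to fsync the file and its directory.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): write the manifest contents
 * under a temporary name.  Returns 0 on success, -1 on failure.
 */
int
write_manifest_tmp(const char *tmp_path, const char *contents)
{
    FILE       *fp;
    size_t      len;

    assert(tmp_path != NULL && contents != NULL);

    fp = fopen(tmp_path, "w");
    if (fp == NULL)
        return -1;

    len = strlen(contents);
    if (fwrite(contents, 1, len, fp) != len)
    {
        fclose(fp);
        return -1;
    }

    /* fclose() flushes stdio buffers; durability would also need fsync. */
    return (fclose(fp) == 0) ? 0 : -1;
}

/*
 * Hypothetical helper: to be called only after pg_wal has been fully
 * written.  Until this runs, no file exists under the final name, so a
 * backup that was killed partway through has no manifest to validate.
 */
int
finalize_manifest(const char *tmp_path, const char *final_path)
{
    assert(tmp_path != NULL && final_path != NULL);

    return (rename(tmp_path, final_path) == 0) ? 0 : -1;
}
```

The point of splitting it in two is that the caller decides when the backup is complete; the verification tool then only has to look for the final name.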

> What do you think of having the verification process also call pg_waldump to
> validate the WAL CRCs (shown upthread)? That looked helpful and simple.

I don't love calls to external binaries, but I think the thing that
really bothers me is that pg_waldump is practically bound to terminate
with an error, because the last WAL segment will end with a partial
record. For the same reason, I think there's really no such thing as
validating a single WAL file. I suppose you'd need to know the exact
start and end locations for a minimal WAL replay and check that all
records between those LSNs appear OK, ignoring any apparent problems
after the minimum ending point, or at least ignoring any problems due
to an incomplete record in the last file. We don't have a tool for
that currently, and I don't think I can write one this week. Or at
least, not a good one.
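
To illustrate the kind of check I mean, here is a toy model only. WalRecordSketch and wal_range_ok() are made up for this example, and it assumes records have already been decoded into an array and follow each other with no gaps, which glosses over alignment padding and page headers in real WAL. The idea is just that problems before the minimum ending point are fatal, and anything after it is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a decoded WAL record. */
typedef struct WalRecordSketch
{
    uint64_t    start_lsn;      /* first byte of the record */
    uint64_t    end_lsn;        /* one past the last byte */
    bool        complete;       /* false if cut off at end of file */
    bool        crc_ok;         /* did the record's CRC match? */
} WalRecordSketch;

/*
 * Return true if valid, gap-free records cover [start_lsn, min_end_lsn).
 * Records before start_lsn are skipped; records after min_end_lsn,
 * including a trailing partial record, are not examined at all.
 */
bool
wal_range_ok(const WalRecordSketch *recs, size_t nrecs,
             uint64_t start_lsn, uint64_t min_end_lsn)
{
    uint64_t    next = start_lsn;
    size_t      i;

    assert(recs != NULL || nrecs == 0);

    for (i = 0; i < nrecs; i++)
    {
        if (next >= min_end_lsn)
            break;              /* covered the required range */
        if (recs[i].start_lsn < start_lsn)
            continue;           /* not needed for replay */
        if (recs[i].start_lsn != next)
            return false;       /* gap in the required range */
        if (!recs[i].complete || !recs[i].crc_ok)
            return false;       /* damage inside the required range */
        next = recs[i].end_lsn;
    }

    return next >= min_end_lsn;
}
```

With records covering 100-150, 150-200, and a partial record starting at 200, a required range of 100 to 200 passes, while a required range of 100 to 220 fails because it runs into the partial record.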

> I think this functionality doesn't belong in its own program. If you suspect
> pg_basebackup or pg_restore will eventually gain the ability to merge
> incremental backups into a recovery-ready base backup, I would put the
> functionality in that program. Otherwise, I would put it in pg_checksums.
> For me, part of the friction here is that the program description indicates
> general verification, but the actual functionality merely checks hashes on a
> directory tree that happens to represent a PostgreSQL base backup.

Suraj's original patch made this part of pg_basebackup, but I didn't
really like that, because I wanted it to have its own set of options.
I still think all the options I've added are pretty useful ones, and I
can think of other things somebody might want to do. It feels very
uncomfortable to make pg_basebackup, or pg_checksums, take either
options from set A and do thing X, or options from set B and do thing
Y. But it feels clear that the name pg_validatebackup is not going
over very well with anyone. I think I should rename it to
pg_validatemanifest.

> > + parse->pathname = palloc(raw_length + 1);
>
> I don't see this freed anywhere; is it? (It's useful to make peak memory
> consumption not grow in proportion to the number of files backed up.)

We need the hash table to remain populated for the whole run time of
the tool, because we're essentially doing a full join of the actual
directory contents against the manifest contents. That's a bit
unfortunate but it doesn't seem simple to improve. I think the only
people who are really going to suffer are people who have an enormous
pile of empty or nearly-empty relations. People who have large
databases for the normal reason - i.e. a reasonable number of tables
that hold a lot of data - will have manifests of very manageable size.
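
To show why the entries have to stick around, here is a stripped-down sketch of the join. It is not the tool's actual code: ManifestTable and the manifest_* functions are invented for illustration, and it uses plain malloc and a fixed bucket count rather than the real hash table. Every path found on disk probes the table and marks its entry; only after the whole directory walk can the unmarked entries be reported as missing, so nothing can be freed early.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64

/* Hypothetical manifest entry; "matched" is set when seen on disk. */
typedef struct ManifestEntry
{
    const char *pathname;       /* caller-owned string, not copied */
    bool        matched;
    struct ManifestEntry *next;
} ManifestEntry;

typedef struct ManifestTable
{
    ManifestEntry *buckets[NBUCKETS];
} ManifestTable;

static unsigned
hash_path(const char *s)
{
    unsigned    h = 5381;

    while (*s)
        h = h * 33 + (unsigned char) *s++;
    return h % NBUCKETS;
}

/* Load one path from the manifest.  Returns false if out of memory. */
bool
manifest_add(ManifestTable *t, const char *pathname)
{
    ManifestEntry *e;
    unsigned    b;

    assert(t != NULL && pathname != NULL);
    e = malloc(sizeof(ManifestEntry));
    if (e == NULL)
        return false;
    b = hash_path(pathname);
    e->pathname = pathname;
    e->matched = false;
    e->next = t->buckets[b];
    t->buckets[b] = e;
    return true;
}

/* Probe with a path found on disk; false means "not in the manifest". */
bool
manifest_probe(ManifestTable *t, const char *pathname)
{
    ManifestEntry *e;

    for (e = t->buckets[hash_path(pathname)]; e != NULL; e = e->next)
    {
        if (strcmp(e->pathname, pathname) == 0)
        {
            e->matched = true;
            return true;
        }
    }
    return false;
}

/* After the directory walk: entries never matched are missing on disk. */
size_t
manifest_count_unmatched(const ManifestTable *t)
{
    size_t      n = 0;
    int         b;
    const ManifestEntry *e;

    for (b = 0; b < NBUCKETS; b++)
        for (e = t->buckets[b]; e != NULL; e = e->next)
            if (!e->matched)
                n++;
    return n;
}

/* Release all entries once the final report has been produced. */
void
manifest_free(ManifestTable *t)
{
    int         b;

    for (b = 0; b < NBUCKETS; b++)
    {
        while (t->buckets[b] != NULL)
        {
            ManifestEntry *e = t->buckets[b];

            t->buckets[b] = e->next;
            free(e);
        }
    }
}
```

A failed probe is an extra file on disk, and an unmatched entry at the end is a file the manifest expected but the backup lacks. Memory is one entry per file regardless of file size, which is why lots of tiny relations is the bad case.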

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
