From: David Steele <david(at)pgmasters(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: backup manifests
Date: 2019-09-20 23:11:47
Message-ID: fd84612d-8bf4-0db1-14cf-02aa4f9ca396@pgmasters.net

On 9/20/19 2:55 PM, Robert Haas wrote:
> On Fri, Sep 20, 2019 at 11:09 AM David Steele <david(at)pgmasters(dot)net> wrote:
>>
>> It sucks to make that a prereq for this project but the longer we kick
>> that can down the road...
>
> There are no doubt many patches that would benefit from having more
> backend infrastructure exposed in frontend contexts, and I think we're
> slowly moving in that direction, but I generally do not believe in
> burdening feature patches with major infrastructure improvements.

The hardest part about technical debt is knowing when to incur it. It
is never a cut-and-dried choice.

>> This talk was good fun. The largest number of tables we've seen is a
>> few hundred thousand, but that still adds up to more than a million
>> files to back up.
>
> A quick survey of some of my colleagues turned up a few examples of
> people with 2-4 million files to back up, so similar kind of ballpark.
> Probably not big enough for the manifest to hit the 1GB mark, but
> getting close.

I have so many doubts about clusters with this many tables, but we do
support it, so...

>>> I hear you saying that this is going to end up being just as complex
>>> in the end, but I don't think I believe it. It sounds to me like the
>>> difference between spending a couple of hours figuring this out and
>>> spending a couple of months trying to figure it out and maybe not
>>> actually getting anywhere.
>>
>> Maybe the initial implementation will be easier but I am confident we'll
>> pay for it down the road. Also, don't we want users to be able to read
>> this file? Do we really want them to need to cook up a custom parser in
>> Perl, Go, Python, etc.?
>
> Well, I haven't heard anybody complain that they can't read a
> backup_label file because it's too hard to cook up a parser. And I
> think the reason is pretty clear: such files are not hard to parse.
> Similarly for a pg_hba.conf file. This case is a little more
> complicated than those, but AFAICS, not enormously so. Actually, it
> seems like a combination of those two cases: it has some fixed
> metadata fields that can be represented with one line per field, like
> a backup_label, and then a bunch of entries for files that are
> somewhat like entries in a pg_hba.conf file, in that they can be
> represented by a line per record with a certain number of fields on
> each line.

Yeah, they are not hard to parse, but *everyone* has to cook up code for
it. A bit of a bummer, that.

> I attach here a couple of patches. The first one does some
> refactoring of relevant code in pg_basebackup, and the second one adds
> checksum manifests using a format that I pulled out of my ear. It
> probably needs some adjustment but I don't think it's crazy. Each
> file gets a line that looks like this:
>
> File $FILENAME $FILESIZE $FILEMTIME $FILECHECKSUM

We also include page checksum validation failures in the file record.
Not critical for the first pass, perhaps, but something to keep in mind.
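
For what it's worth, records like that are trivial to consume from a
script. A quick sketch in Python (field layout assumed from your example
line; a filename or mtime containing spaces would break the naive split
and would need quoting or escaping in the format):

from dataclasses import dataclass

@dataclass
class ManifestFile:
    name: str
    size: int
    mtime: str
    checksum: str

def parse_file_line(line):
    # Layout assumed from the example above; raises if the field
    # count does not match.
    tag, name, size, mtime, checksum = line.rstrip("\n").split(" ")
    if tag != "File":
        raise ValueError("not a File record: " + line)
    return ManifestFile(name, int(size), mtime, checksum)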

> Right now, the file checksums are computed using SHA-256 but it could
> be changed to anything else for which we've got code. On my system,
> shasum -a256 $FILE produces the same answer that shows up here. At
> the bottom of the manifest there's a checksum of the manifest itself,
> which looks like this:
>
> Manifest-Checksum 385fe156a8c6306db40937d59f46027cc079350ecf5221027d71367675c5f781
>
> That's a SHA-256 checksum of the file contents excluding the final
> line. It can be verified by feeding all the file contents except the
> last line to shasum -a256. I can't help but observe that if the file
> were defined to be a JSONB blob, it's not very clear how you would
> include a checksum of the blob contents in the blob itself, but with a
> format based on a bunch of lines of data, it's super-easy to generate
> and super-easy to write tools that verify it.
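
True, verifying that is about as simple as it gets. A sketch in Python,
equivalent to feeding all but the last line to shasum -a256 as you
describe (it assumes the final line is "Manifest-Checksum <hex>"):

import hashlib

def verify_manifest(path):
    with open(path, "rb") as f:
        lines = f.readlines()
    # The final line carries the digest; everything before it,
    # newlines included, is what gets hashed.
    expected = lines[-1].split()[-1].decode()
    actual = hashlib.sha256(b"".join(lines[:-1])).hexdigest()
    return actual == expected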

You can do this in JSON pretty easily by handling the terminating
brace/bracket:

{
  <some json contents>,
  "checksum": "<sha256>"
}

But of course a linefeed-delimited file is even easier.
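
To make that concrete, a sketch of both sides in Python (the key name
and layout are illustrative only, and it assumes no other field contains
the literal marker string):

import hashlib
import json

def dump_with_checksum(contents):
    # Serialize, drop the closing "\n}", and re-open the object so the
    # digest can be appended as the final member.  The hash covers the
    # raw bytes preceding the digest, so a verifier never has to
    # re-serialize any JSON.  (Assumes a non-empty object.)
    prefix = json.dumps(contents, indent=2)[:-2] + ',\n  "checksum": "'
    digest = hashlib.sha256(prefix.encode()).hexdigest()
    return prefix + digest + '"\n}\n'

def verify(text):
    # Assumes '"checksum": "' appears exactly once in the document.
    prefix, _, rest = text.partition('"checksum": "')
    prefix += '"checksum": "'
    return hashlib.sha256(prefix.encode()).hexdigest() == rest.split('"')[0]

The only fiddly part is splicing the digest in ahead of the terminating
brace, which is exactly the handling I mean above.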

> This is just a prototype so I haven't written a verification tool, and
> there's a bunch of testing and documentation and so forth that would
> need to be done aside from whatever we've got to hammer out in terms
> of design issues and file formats. But I think it's cool, and perhaps
> some discussion of how it could be evolved will get us closer to a
> resolution everybody can at least live with.

I had a quick look and it seems pretty reasonable. I'll need to
generate a manifest to see if I can spot any obvious gotchas.

--
-David
david(at)pgmasters(dot)net
