Re: backup manifests

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>
Cc: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, David Steele <david(at)pgmasters(dot)net>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: backup manifests
Date: 2019-12-18 21:24:03
Message-ID: CA+TgmoZdJ=vBoDU1Skv52Prybp36kMvmpTDBXbmoky0owNgMiA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 17, 2019 at 12:54 AM Suraj Kharage
<suraj(dot)kharage(at)enterprisedb(dot)com> wrote:
> I have implemented the simplehash in backup validator patch as Robert suggested. Please find attached 0002 patch for the same.
>
> kindly review and let me know your thoughts.

+#define CHECKSUM_LENGTH 256

This seems wrong. Not all checksums are the same length, and none of
the ones we're using are 256 bytes in length, and if we've got to have
a constant someplace for the maximum checksum length, it should
probably be in the new header file, not here. But I don't think we
should need this in the first place; see comments below about how to
revise the parsing of the manifest file.

+ char filetype[10];

A mysterious 10-byte field with no comments explaining what it
means... and the same magic number 10 appears in at least one other
place in the patch.

+typedef struct manifesthash_hash *hashtab;

This declares a new *type* called hashtab, not a variable called
hashtab. The new type is not used anywhere, but later, you have
several variables of the same type that have this name. Just remove
this: it's wrong and unused.

+static enum ChecksumAlgorithm checksum_type = MC_NONE;

Remove "enum". Not needed, because you have a typedef for it in the
header, and not per style.

+static manifesthash_hash *create_manifest_hash(char manifest_path[MAXPGPATH]);

Whitespace is wrong. The whole patch needs a visit from pgindent with
a properly-updated typedefs.list.

Also, you will struggle to find anywhere else in the code base where
pass a character array as a function argument. I don't know why this
isn't just char *.

+ if(verify_backup)

Whitespace wrong here, too.

+ * Read the backup_manifest file and generate the hash table, then scan data
+ * directroy and verify each file. Finally, iterate on hash table to find
+ * out missing files.

You've got a word spelled wrong here, but the bigger problem is that
this comment doesn't actually describe what this function is trying to
do. Instead, it describes how it does it. If it's necessary to explain
what steps the function takes in order to accomplish some goal, you
should comment individual bits of code in the function. The header
comment is a high-level overview, not a description of the algorithm.

It's also pretty unhelpful, here and elsewhere, to refer to "the hash
table" as if there were only one, and as if the reader were supposed
to know something about it when you haven't told them anything about
it.

+ if (!entry->matched)
+ {
+ pg_log_info("missing file: %s", entry->filename);
+ }
+

The braces here are not project style. We usually omit braces when
only a single line of code is present.

I think some work needs to be done to standardize and improve the
messages that get produced here. You have:

1. missing file: %s
2. duplicate file present: %s
3. size changed for file: %s, original size: %d, current size: %zu
4. checksum difference for file: %s
5. extra file found: %s

I suggest:

1. file \"%s\" is present in manifest but missing from the backup
2. file \"%s\" has multiple manifest entries
(this one should probably be pg_log_error(), not pg_log_info(), as it
represents a corrupt-manifest problem)
3. file \"%s" has size %lu in manifest but size %lu in backup
4. file \"%s" has checksum %s in manifest but checksum %s in backup
5. file \"%s" is present in backup but not in manifest

Your patch actually doesn't compile on my system, because for the
third message above, it uses %zu to print the size. But %zu is for
size_t, not off_t. I went looking for other places in the code where
we print off_t; based on that, I think the right thing to do is to
print it using %lu and write (unsigned long) st.st_size.

+ char file_checksum[256];
+ char header[1024];

More arbitrary constants.

+ if (!file)
+ {
+ pg_log_error("could not open backup_manifest");

That's bad error reporting. See e.g. readfile() in initdb.c.

+ if (fscanf(file, "%1023[^\n]\n", header) != 1)
+ {
+ pg_log_error("error while reading the header from backup_manifest");

That's also bad error reporting. It is only a slight step up from
"ERROR: error".

And we have another magic number (1023).

+ appendPQExpBufferStr(manifest, header);
+ appendPQExpBufferStr(manifest, "\n");
...
+ appendPQExpBuffer(manifest, "File\t%s\t%d\t%s\t%s\n", filename,
+ filesize, mtime, checksum_with_type);

This whole thing seems completely crazy to me. Basically, you're
trying to use fscanf() to parse the file. But then, because fscanf()
doesn't give you the original bytes back, you're trying to reassemble
the data that you parsed to recover the original line, so that you can
stuff it in the buffer and eventually checksum it. However, that's
highly error-prone. You're basically duplicating server code, and thus
risking getting out of sync in the server code, to work around a
problem that is entirely self-inflicted, namely, deciding to use
fscanf().

What I would recommend is:

1. Use open(), read(), close() rather than the fopen() family of
functions. As we have discovered elsewhere, fread() doesn't promise to
set errno, so we can't necessarily get reliable error-reporting out of
it.

2. Before you start reading the file, create a buffer that's large
enough to hold the whole thing, by using fstat() to figure out how big
the file is. Read the whole file into that buffer. If you're not able
to read the whole file -- i.e. open() or read() or close() fail --
then just error out and exit.

3. Now advance through the file line by line. Write a function that
knows how to search forward for the next \r or \n but with checks to
make sure it can't run off the end of the buffer, and use that to
locate the end of each line so that you can walk forward. As you walk
forward line by line, add the line you just processed to the checksum.
That way, you only need a single pass over the data. Also, you can
modify it in place. More on that below.

4. As you examine each line, start by examining the first word. You'll
need a function that finds the first word by searching forward for a
tab character, but not beyond the end of the line. The first word of
the first line should be PostgreSQL-Backup-Manifest-Version and the
second word should be 1. Then on each subsequent line check whether
the first word is File or Manifest-Checksum or something else,
erroring out in the last case. If it's Manifest-Checksum, verify that
this is the last line of the file and that the checksum matches. If
it's File, break the line into fields so you can add it to the hash
table. You'll want a pointer to the filename and a pointer to the
checksum, and you'll want to parse the size as an integer. Instead of
allocating new memory for those fields, just overwrite the character
that follows the field with a \0. There must be one - either \t or \n
- so you shouldn't run off the end of the buffer.

If you do this, a bunch of the fixed-size buffers you have right now
go away. You don't need the variable filetype[10] any more, or
checksum_with_type[CHECKSUM_LENGTH], or checksum[CHECKSUM_LENGTH], or
the character arrays inside DataDirectoryFileInfo. Instead you can
just have pointers into the buffer that contains the file. And you
don't need this code to back up using fseek() and reread the lines,
either.

Also read this article:

https://stackoverflow.com/questions/2430303/disadvantages-of-scanf

Note that the very first point in the article talks about the problem
of overrunning the buffer, which you certainly have in the current
code right here:

+ if (fscanf(file, "%s\t%s\t%d\t%23[^\t] %s\n", filetype, filename,

filetype is declared as char[10], but %s could read arbitrarily much data.

+ filename = (char*) pg_malloc(MAXPGPATH);

pg_malloc returns void *, so no cast is required.

+ if (strcmp(checksum_with_type, "-") == 0)
+ {
+ checksum_type = MC_NONE;
+ }
+ else
+ {
+ if (strncmp(checksum_with_type, "SHA256", 6) == 0)

Use parse_checksum_algorithm. Right now you've invented a "common"
function with 1 caller, but I explicitly suggested previously that you
put it in common so that you could reuse it.

+ if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0 ||
+ strcmp(de->d_name, "pg_wal") == 0)
+ continue;

Ignoring pg_wal at the top level might be OK, but this will ignore a
pg_wal entry anywhere in the directory tree.

+ /* Skip backup manifest file. */
+ if (strcmp(de->d_name, "backup_manifest") == 0)
+ return;

Same problem.

+ filename = createPQExpBuffer();
+ if (!filename)
+ {
+ pg_log_error("out of memory");
+ exit(1);
+ }
+
+ appendPQExpBuffer(filename, "%s%s", relative_path, de->d_name);

Just use char filename[MAXPGPATH] and snprintf here, as you do
elsewhere. It will be simpler and save memory.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-12-18 21:24:50 Re: Read Uncommitted
Previous Message David G. Johnston 2019-12-18 20:36:48 Re: Restore backup file "with oids"