Re: backup manifests

From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, David Steele <david(at)pgmasters(dot)net>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: backup manifests
Date: 2019-12-09 05:45:23
Message-ID: CAM2+6=Vyj03SwJ7Zya5qD=VbKU+ZyaK+erCe9yXGAPi41uyFcQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 6, 2019 at 12:05 PM Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
wrote:

>
>
> On Fri, Dec 6, 2019 at 1:44 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> On Thu, Dec 5, 2019 at 11:22 AM Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
>> wrote:
>> > Here is the whole stack of patches.
>>
>> I committed 0001, as that's just refactoring and I think (hope) it's
>> uncontroversial. I think 0002-0005 need to be squashed together
>> (crediting all authors properly and in the appropriate order) as it's
>> quite hard to understand right now,
>
>
> Please find attached single patch and I tried to add the credit to all
> the authors.
>

I had a look over the patch and here are my few review comments:

1.
+ if (pg_strcasecmp(manifest_checksum_algo, "SHA256") == 0)
+ manifest_checksums = MC_SHA256;
+ else if (pg_strcasecmp(manifest_checksum_algo, "CRC32C") == 0)
+ manifest_checksums = MC_CRC32C;
+ else if (pg_strcasecmp(manifest_checksum_algo, "NONE") == 0)
+ manifest_checksums = MC_NONE;
+ else
+ ereport(ERROR,

Is NONE is a valid input? I think the default is "NONE" only and thus no
need
of this as an input. It will be better if we simply error out if input is
neither "SHA256" nor "CRC32C".

I believe you have done this way as from pg_basebackup you are always
passing
MANIFEST_CHECKSUMS '%s' string which will have "NONE" if no user input is
given. But I think passing that conditional will be better like we have
maxrate_clause for example.

Well, this is what I think, feel free to ignore as I don't see any
correctness
issue over here.

2.
+ if (manifest_checksums != MC_NONE)
+ {
+ checksumbuflen = finalize_manifest_checksum(cCtx, checksumbuf);
+ switch (manifest_checksums)
+ {
+ case MC_NONE:
+ break;
+ }

Since switch case is within "if (manifest_checksums != MC_NONE)" condition,
I don't think we need a case for MC_NONE here. Rather we can use a default
case to error out.

3.
+ if (manifest_checksums != MC_NONE)
+ {
+ initialize_manifest_checksum(&cCtx);
+ update_manifest_checksum(&cCtx, content, len);
+ }

@@ -1384,6 +1641,9 @@ sendFile(const char *readfilename, const char
*tarfilename, struct stat *statbuf
int segmentno = 0;
char *segmentpath;
bool verify_checksum = false;
+ ChecksumCtx cCtx;
+
+ initialize_manifest_checksum(&cCtx);

I see that in a few cases you are calling
initialize/update_manifest_checksum()
conditional and at some other places call is unconditional. It seems like
calling unconditional will not have any issues as switch cases inside them
return doing nothing when manifest_checksums is MC_NONE.

4.
initialize/update/finalize_manifest_checksum() functions may be needed by
the
validation patch as well. And thus I think these functions should not depend
on a global variable as such. Also, it will be good if we keep them in a
file
that is accessible to frontend-only code. Well, you can ignore these
comments
with the argument saying that this refactoring can be done by the patch
adding
validation support. I have no issues. Since both the patches are dependent
and
posted on the same email chain, thought of putting that observation.

5.
+ switch (manifest_checksums)
+ {
+ case MC_SHA256:
+ checksumlabel = "SHA256:";
+ break;
+ case MC_CRC32C:
+ checksumlabel = "CRC32C:";
+ break;
+ case MC_NONE:
+ break;
+ }

This code in AddFileToManifest() is executed for every file for which we are
adding an entry. However, the checksumlabel will be going to remain the same
throughout. Can it be set just once and then used as is?

6.
Can we avoid manifest_checksums from declaring it as a global variable?
I think for that, we need to pass that to every function and thus need to
change the function signature of various functions. Currently, we pass
"StringInfo manifest" to all the required function, will it better to pass
the struct variable instead? A struct may have members like,
"StringInfo manifest" in it, checksum type (manifest_checksums),
checksum label, etc.

Thanks
--
Jeevan Chalke
Associate Database Architect & Team Lead, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2019-12-09 07:53:54 ALTER TABLE support for dropping generation expression
Previous Message Mark Dilger 2019-12-09 04:55:19 Re: [Proposal] Level4 Warnings show many shadow vars