Re: backup manifests

From: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, David Steele <david(at)pgmasters(dot)net>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: backup manifests
Date: 2019-12-09 09:22:34
Message-ID: CAGPqQf2N9-xQYH=azPRzGUvqAMhMYemy3Gsy2sZnVjD2s+3fRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks Jeevan for reviewing the patch and offline discussion.

On Mon, Dec 9, 2019 at 11:15 AM Jeevan Chalke <
jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:

>
>
> On Fri, Dec 6, 2019 at 12:05 PM Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
> wrote:
>
>>
>>
>> On Fri, Dec 6, 2019 at 1:44 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>
>>> On Thu, Dec 5, 2019 at 11:22 AM Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
>>> wrote:
>>> > Here is the whole stack of patches.
>>>
>>> I committed 0001, as that's just refactoring and I think (hope) it's
>>> uncontroversial. I think 0002-0005 need to be squashed together
>>> (crediting all authors properly and in the appropriate order) as it's
>>> quite hard to understand right now,
>>
>>
>> Please find attached single patch and I tried to add the credit to all
>> the authors.
>>
>
> I had a look over the patch and here are my few review comments:
>
> 1.
> + if (pg_strcasecmp(manifest_checksum_algo, "SHA256") == 0)
> + manifest_checksums = MC_SHA256;
> + else if (pg_strcasecmp(manifest_checksum_algo, "CRC32C") == 0)
> + manifest_checksums = MC_CRC32C;
> + else if (pg_strcasecmp(manifest_checksum_algo, "NONE") == 0)
> + manifest_checksums = MC_NONE;
> + else
> + ereport(ERROR,
>
> Is NONE is a valid input? I think the default is "NONE" only and thus no
> need
> of this as an input. It will be better if we simply error out if input is
> neither "SHA256" nor "CRC32C".
>
> I believe you have done this way as from pg_basebackup you are always
> passing
> MANIFEST_CHECKSUMS '%s' string which will have "NONE" if no user input is
> given. But I think passing that conditional will be better like we have
> maxrate_clause for example.
>
> Well, this is what I think, feel free to ignore as I don't see any
> correctness
> issue over here.
>
>
I would still keep this NONE as it's look more cleaner in the say of
given options to the checksums.

> 2.
> + if (manifest_checksums != MC_NONE)
> + {
> + checksumbuflen = finalize_manifest_checksum(cCtx, checksumbuf);
> + switch (manifest_checksums)
> + {
> + case MC_NONE:
> + break;
> + }
>
> Since switch case is within "if (manifest_checksums != MC_NONE)" condition,
> I don't think we need a case for MC_NONE here. Rather we can use a default
> case to error out.
>
>
Yeah, with the new patch we don't have this part of code.

> 3.
> + if (manifest_checksums != MC_NONE)
> + {
> + initialize_manifest_checksum(&cCtx);
> + update_manifest_checksum(&cCtx, content, len);
> + }
>
> @@ -1384,6 +1641,9 @@ sendFile(const char *readfilename, const char
> *tarfilename, struct stat *statbuf
> int segmentno = 0;
> char *segmentpath;
> bool verify_checksum = false;
> + ChecksumCtx cCtx;
> +
> + initialize_manifest_checksum(&cCtx);
>
>
> I see that in a few cases you are calling
> initialize/update_manifest_checksum()
> conditional and at some other places call is unconditional. It seems like
> calling unconditional will not have any issues as switch cases inside them
> return doing nothing when manifest_checksums is MC_NONE.
>
>
Fixed.

> 4.
> initialize/update/finalize_manifest_checksum() functions may be needed by
> the
> validation patch as well. And thus I think these functions should not
> depend
> on a global variable as such. Also, it will be good if we keep them in a
> file
> that is accessible to frontend-only code. Well, you can ignore these
> comments
> with the argument saying that this refactoring can be done by the patch
> adding
> validation support. I have no issues. Since both the patches are dependent
> and
> posted on the same email chain, thought of putting that observation.
>
>
Make sense, I just changed those API to that it doesn't have to
access the global.

> 5.
> + switch (manifest_checksums)
> + {
> + case MC_SHA256:
> + checksumlabel = "SHA256:";
> + break;
> + case MC_CRC32C:
> + checksumlabel = "CRC32C:";
> + break;
> + case MC_NONE:
> + break;
> + }
>
> This code in AddFileToManifest() is executed for every file for which we
> are
> adding an entry. However, the checksumlabel will be going to remain the
> same
> throughout. Can it be set just once and then used as is?
>
>
Yeah, with the attached patch we no more have this part of code.

> 6.
> Can we avoid manifest_checksums from declaring it as a global variable?
> I think for that, we need to pass that to every function and thus need to
> change the function signature of various functions. Currently, we pass
> "StringInfo manifest" to all the required function, will it better to pass
> the struct variable instead? A struct may have members like,
> "StringInfo manifest" in it, checksum type (manifest_checksums),
> checksum label, etc.
>
>
I agree. Earlier I was not sure about this because that require data
structure
to expose. But in the given attached patch that's what I tried, introduced
new
data structure and defined in basebackup.h and passed the same through the
function so that doesn't require to pass an individual members. Also
removed
global manifest_checksum and added the same in the newly introduced
structure.

Attaching the patch, which need to apply on the top of earlier 0001 patch.

Thanks,

--
Rushabh Lathia
www.EnterpriseDB.com

Attachment Content-Type Size
0002_new_data_structure.patch text/x-patch 18.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message amul sul 2019-12-09 09:38:53 Re: [HACKERS] advanced partition matching algorithm for partition-wise join
Previous Message Amit Kapila 2019-12-09 09:07:17 Re: Questions/Observations related to Gist vacuum