Re: backup manifests

From: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, David Steele <david(at)pgmasters(dot)net>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: backup manifests
Date: 2019-12-10 09:59:43
Message-ID: CAGPqQf2oE--OyXdoXP127+9G97WQ9p0C9m54W4Evdo167zQ+rw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 9, 2019 at 2:52 PM Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
wrote:

>
> Thanks Jeevan for reviewing the patch and offline discussion.
>
> On Mon, Dec 9, 2019 at 11:15 AM Jeevan Chalke <
> jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
>
>>
>>
>> On Fri, Dec 6, 2019 at 12:05 PM Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
>> wrote:
>>
>>>
>>>
>>> On Fri, Dec 6, 2019 at 1:44 AM Robert Haas <robertmhaas(at)gmail(dot)com>
>>> wrote:
>>>
>>>> On Thu, Dec 5, 2019 at 11:22 AM Rushabh Lathia <
>>>> rushabh(dot)lathia(at)gmail(dot)com> wrote:
>>>> > Here is the whole stack of patches.
>>>>
>>>> I committed 0001, as that's just refactoring and I think (hope) it's
>>>> uncontroversial. I think 0002-0005 need to be squashed together
>>>> (crediting all authors properly and in the appropriate order) as it's
>>>> quite hard to understand right now,
>>>
>>>
>>> Please find attached single patch and I tried to add the credit to all
>>> the authors.
>>>
>>
>> I had a look over the patch and here are my few review comments:
>>
>> 1.
>> + if (pg_strcasecmp(manifest_checksum_algo, "SHA256") == 0)
>> + manifest_checksums = MC_SHA256;
>> + else if (pg_strcasecmp(manifest_checksum_algo, "CRC32C") ==
>> 0)
>> + manifest_checksums = MC_CRC32C;
>> + else if (pg_strcasecmp(manifest_checksum_algo, "NONE") == 0)
>> + manifest_checksums = MC_NONE;
>> + else
>> + ereport(ERROR,
>>
>> Is NONE is a valid input? I think the default is "NONE" only and thus no
>> need
>> of this as an input. It will be better if we simply error out if input is
>> neither "SHA256" nor "CRC32C".
>>
>> I believe you have done this way as from pg_basebackup you are always
>> passing
>> MANIFEST_CHECKSUMS '%s' string which will have "NONE" if no user input is
>> given. But I think passing that conditional will be better like we have
>> maxrate_clause for example.
>>
>> Well, this is what I think, feel free to ignore as I don't see any
>> correctness
>> issue over here.
>>
>>
> I would still keep this NONE as it's look more cleaner in the say of
> given options to the checksums.
>
>
>> 2.
>> + if (manifest_checksums != MC_NONE)
>> + {
>> + checksumbuflen = finalize_manifest_checksum(cCtx, checksumbuf);
>> + switch (manifest_checksums)
>> + {
>> + case MC_NONE:
>> + break;
>> + }
>>
>> Since switch case is within "if (manifest_checksums != MC_NONE)"
>> condition,
>> I don't think we need a case for MC_NONE here. Rather we can use a default
>> case to error out.
>>
>>
> Yeah, with the new patch we don't have this part of code.
>
>
>> 3.
>> + if (manifest_checksums != MC_NONE)
>> + {
>> + initialize_manifest_checksum(&cCtx);
>> + update_manifest_checksum(&cCtx, content, len);
>> + }
>>
>> @@ -1384,6 +1641,9 @@ sendFile(const char *readfilename, const char
>> *tarfilename, struct stat *statbuf
>> int segmentno = 0;
>> char *segmentpath;
>> bool verify_checksum = false;
>> + ChecksumCtx cCtx;
>> +
>> + initialize_manifest_checksum(&cCtx);
>>
>>
>> I see that in a few cases you are calling
>> initialize/update_manifest_checksum()
>> conditional and at some other places call is unconditional. It seems like
>> calling unconditional will not have any issues as switch cases inside them
>> return doing nothing when manifest_checksums is MC_NONE.
>>
>>
> Fixed.
>
>
>> 4.
>> initialize/update/finalize_manifest_checksum() functions may be needed by
>> the
>> validation patch as well. And thus I think these functions should not
>> depend
>> on a global variable as such. Also, it will be good if we keep them in a
>> file
>> that is accessible to frontend-only code. Well, you can ignore these
>> comments
>> with the argument saying that this refactoring can be done by the patch
>> adding
>> validation support. I have no issues. Since both the patches are
>> dependent and
>> posted on the same email chain, thought of putting that observation.
>>
>>
> Make sense, I just changed those API to that it doesn't have to
> access the global.
>
>
>> 5.
>> + switch (manifest_checksums)
>> + {
>> + case MC_SHA256:
>> + checksumlabel = "SHA256:";
>> + break;
>> + case MC_CRC32C:
>> + checksumlabel = "CRC32C:";
>> + break;
>> + case MC_NONE:
>> + break;
>> + }
>>
>> This code in AddFileToManifest() is executed for every file for which we
>> are
>> adding an entry. However, the checksumlabel will be going to remain the
>> same
>> throughout. Can it be set just once and then used as is?
>>
>>
> Yeah, with the attached patch we no more have this part of code.
>
>
>> 6.
>> Can we avoid manifest_checksums from declaring it as a global variable?
>> I think for that, we need to pass that to every function and thus need to
>> change the function signature of various functions. Currently, we pass
>> "StringInfo manifest" to all the required function, will it better to pass
>> the struct variable instead? A struct may have members like,
>> "StringInfo manifest" in it, checksum type (manifest_checksums),
>> checksum label, etc.
>>
>>
> I agree. Earlier I was not sure about this because that require data
> structure
> to expose. But in the given attached patch that's what I tried,
> introduced new
> data structure and defined in basebackup.h and passed the same through the
> function so that doesn't require to pass an individual members. Also
> removed
> global manifest_checksum and added the same in the newly introduced
> structure.
>
> Attaching the patch, which need to apply on the top of earlier 0001 patch.
>

Attaching another version of 0002 patch, as my collogue Jeevan Chalke
pointed
few indentation problem in 0002 patch which I sent earlier. Fixed the same
in
the latest patch.

> Thanks,
>
> --
> Rushabh Lathia
> www.EnterpriseDB.com
>

--
Rushabh Lathia

Attachment Content-Type Size
0002_new_data_structure_v2.patch text/x-patch 18.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2019-12-10 10:12:34 Re: Online checksums verification in the backend
Previous Message jiankang liu 2019-12-10 09:52:29 Start Walreceiver completely before shut down it on standby server.