Re: backup manifests

From: Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>
Subject: Re: backup manifests
Date: 2019-11-25 09:35:22
Message-ID: CAF1DzPWy1bT3OfxConS7KzihwbjAX_4-u4ws3epyMsq0jek-Xg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Jeevan,

I have incorporated all the comments in the attached patch. Please review
and let me know your thoughts.

On Thu, Nov 21, 2019 at 2:51 PM Jeevan Chalke <
jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:

>
>
> On Wed, Nov 20, 2019 at 11:05 AM Suraj Kharage <
> suraj(dot)kharage(at)enterprisedb(dot)com> wrote:
>
>> Hi,
>>
>> Since now we are generating the backup manifest file with each backup, it
>> provides us an option to validate the given backup.
>> Let's say, we have taken a backup and after a few days, we want to check
>> whether that backup is validated or corruption-free without restarting the
>> server.
>>
>> Please find attached POC patch for same which will be based on the latest
>> backup manifest patch from Rushabh. With this functionality, we add new
>> option to pg_basebackup, something like --verify-backup.
>> So, the syntax would be:
>> ./bin/pg_basebackup --verify-backup -D <backup_directory_path>
>>
>> Basically, we read the backup_manifest file line by line from the given
>> directory path and build the hash table, then scan the directory and
>> compare each file with the hash entry.
>>
>> Thoughts/suggestions?
>>
>
>
> I like the idea of verifying the backup once we have backup_manifest with
> us.
> Periodically verifying the already taken backup with this simple tool
> becomes
> easy now.
>
> I have reviewed this patch and here are my comments:
>
> 1.
> @@ -30,7 +30,9 @@
> #include "common/file_perm.h"
> #include "common/file_utils.h"
> #include "common/logging.h"
> +#include "common/sha2.h"
> #include "common/string.h"
> +#include "fe_utils/simple_list.h"
> #include "fe_utils/recovery_gen.h"
> #include "fe_utils/string_utils.h"
> #include "getopt_long.h"
> @@ -38,12 +40,19 @@
> #include "pgtar.h"
> #include "pgtime.h"
> #include "pqexpbuffer.h"
> +#include "pgrhash.h"
> #include "receivelog.h"
> #include "replication/basebackup.h"
> #include "streamutil.h"
>
> Please add new files in order.
>
> 2.
> Can hash related file names be renamed to backuphash.c and backuphash.h?
>
> 3.
> Need indentation adjustments at various places.
>
> 4.
> + char buf[1000000]; // 1MB chunk
>
> It will be good if we have multiple of block /page size (or at-least power
> of 2
> number).
>
> 5.
> +typedef struct pgrhash_entry
> +{
> + struct pgrhash_entry *next; /* link to next entry in same bucket */
> + DataDirectoryFileInfo *record;
> +} pgrhash_entry;
> +
> +struct pgrhash
> +{
> + unsigned nbuckets; /* number of buckets */
> + pgrhash_entry **bucket; /* pointer to hash entries */
> +};
> +
> +typedef struct pgrhash pgrhash;
>
> These two can be moved to .h file instead of redefining over there.
>
> 6.
> +/*
> + * TODO: this function is not necessary, can be removed.
> + * Test whether the given row number is match for the supplied keys.
> + */
> +static bool
> +pgrhash_compare(char *bt_filename, char *filename)
>
> Yeah, it can be removed by doing strcmp() at the required places rather
> than
> doing it in a separate function.
>
> 7.
> mdate is not compared anywhere. I understand that it can't be compared with
> the file in the backup directory and its entry in the manifest as manifest
> entry gives mtime from server file whereas the same file in the backup will
> have different mtime. But adding a few comments there will be good.
>
> 8.
> + char mdate[24];
>
> should be mtime instead?
>
>
> Thanks
>
> --
> Jeevan Chalke
> Associate Database Architect & Team Lead, Product Development
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
>

--
--

Thanks & Regards,
Suraj kharage,
EnterpriseDB Corporation,
The Postgres Database Company.

Attachment Content-Type Size
backup_validator_POC.patch application/octet-stream 16.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2019-11-25 09:37:32 Re: adding partitioned tables to publications
Previous Message Antonin Houska 2019-11-25 09:02:00 Re: Attempt to consolidate reading of XLOG page