Re: block-level incremental backup

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
Cc: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: block-level incremental backup
Date: 2019-08-27 11:16:32
Message-ID: CALDaNm15g1XSgSAC0GncYigOuveBUsc92p7_e-SPWFA2t9_GqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 16, 2019 at 8:07 PM Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com> wrote:
>
> What do you mean by bigger file, a file greater than 1GB? In which case you get file > 1GB?
>
>
>
Few comments:
Comment:
+ buf = (char *) malloc(statbuf->st_size);
+ if (buf == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
+
+ if ((cnt = fread(buf, 1, statbuf->st_size, fp)) > 0)
+ {
+ Bitmapset *mod_blocks = NULL;
+ int nmodblocks = 0;
+
+ if (cnt % BLCKSZ != 0)
+ {

We can use same size as full page size.
After pg start backup full page write will be enabled.
We can use the same file size to maintain data consistency.

Comment:
/* Validate given LSN and convert it into XLogRecPtr. */
+ opt->lsn = pg_lsn_in_internal(strVal(defel->arg), &have_error);
+ if (XLogRecPtrIsInvalid(opt->lsn))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("invalid value for LSN")));

Validate input lsn is less than current system lsn.

Comment:
/* Validate given LSN and convert it into XLogRecPtr. */
+ opt->lsn = pg_lsn_in_internal(strVal(defel->arg), &have_error);
+ if (XLogRecPtrIsInvalid(opt->lsn))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("invalid value for LSN")));

Should we check if it is same timeline as the system's timeline.

Comment:
+ if (fread(blkdata, 1, BLCKSZ, infp) != BLCKSZ)
+ {
+ pg_log_error("could not read from file \"%s\": %m", outfn);
+ cleanup_filemaps(filemaps, fmindex + 1);
+ exit(1);
+ }
+
+ /* Finally write one block to the output file */
+ if (fwrite(blkdata, 1, BLCKSZ, outfp) != BLCKSZ)
+ {
+ pg_log_error("could not write to file \"%s\": %m", outfn);
+ cleanup_filemaps(filemaps, fmindex + 1);
+ exit(1);
+ }

Should we support compression formats supported by pg_basebackup.
This can be an enhancement after the functionality is completed.

Comment:
We should provide some mechanism to validate the backup. To identify
if some backup is corrupt or some file is missing(deleted) in a
backup.

Comment:
+ ofp = fopen(tofn, "wb");
+ if (ofp == NULL)
+ {
+ pg_log_error("could not create file \"%s\": %m", tofn);
+ exit(1);
+ }

ifp should be closed in the error flow.

Comment:
+ fp = fopen(filename, "r");
+ if (fp == NULL)
+ {
+ pg_log_error("could not read file \"%s\": %m", filename);
+ exit(1);
+ }
+
+ labelfile = pg_malloc(statbuf.st_size + 1);
+ if (fread(labelfile, 1, statbuf.st_size, fp) != statbuf.st_size)
+ {
+ pg_log_error("corrupted file \"%s\": %m", filename);
+ pg_free(labelfile);
+ exit(1);
+ }

fclose can be moved above.

Comment:
+ if (!modifiedblockfound)
+ {
+ copy_whole_file(fm->filename, outfn);
+ cleanup_filemaps(filemaps, fmindex + 1);
+ return;
+ }
+
+ /* Write all blocks to the output file */
+
+ if (fstat(fileno(fm->fp), &statbuf) != 0)
+ {
+ pg_log_error("could not stat file \"%s\": %m", fm->filename);
+ pg_free(filemaps);
+ exit(1);
+ }

Some error flow, cleanup_filemaps need to be called to close the file
descriptors that are opened.

Comment:
+/*
+ * When to send the whole file, % blocks modified (90%)
+ */
+#define WHOLE_FILE_THRESHOLD 0.9
+

This can be user configured value.
This can be an enhancement after the functionality is completed.

Comment:
We can add a readme file with all the details regarding incremental
backup and combine backup.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2019-08-27 12:45:36 Re: "ago" times on buildfarm status page
Previous Message ROS Didier 2019-08-27 10:47:24 PostgreSQL and Real Application Testing (RAT)