Re: block-level incremental backup

From: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: block-level incremental backup
Date: 2019-08-06 18:37:21
Message-ID: CALtqXTcWY7YwEPOVbw39KO7T+XshaVhQr0e0NzRNvKyo9h8C-Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 6, 2019 at 11:31 PM Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com> wrote:

>
> I have not looked at the patch in detail, but here are some nits from my side.
>
> On Fri, Aug 2, 2019 at 6:13 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
>> On Thu, Aug 1, 2019 at 5:06 PM Jeevan Chalke
>> <jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
>> >
>> > On Tue, Jul 30, 2019 at 9:39 AM Jeevan Chalke <
>> jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
>> >>
>> >> I am almost done writing the patch for pg_combinebackup and will post
>> soon.
>> >
>> >
>> > Attached patch which implements the pg_combinebackup utility used to
>> combine
>> > full basebackup with one or more incremental backups.
>> >
>> > I have tested it manually and it works for all the best-case scenarios.
>> >
>> > Let me know if you have any inputs/suggestions/review comments?
>> >
>> Some comments:
>> 1) There will be some link files created for tablespaces; we might
>> require some special handling for them.
>>
>> 2)
>> + while (numretries <= maxretries)
>> + {
>> + rc = system(copycmd);
>> + if (rc == 0)
>> + return;
>>
>> Use an API to copy the file instead of "system"; better to use a secure
> copy.
>
Ah, it is a local copy, so a simple copy API is enough.
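
For reference, a rough sketch of what a plain local-copy helper could look
like (copy_file_local is a hypothetical name, not the actual patch code;
frontend logging, stdio only, error handling kept minimal):

static void
copy_file_local(const char *src, const char *dst)
{
    FILE       *in;
    FILE       *out;
    char        buf[8192];
    size_t      nread;

    in = fopen(src, "rb");
    if (in == NULL)
    {
        pg_log_error("could not open file \"%s\": %m", src);
        exit(1);
    }

    out = fopen(dst, "wb");
    if (out == NULL)
    {
        pg_log_error("could not create file \"%s\": %m", dst);
        exit(1);
    }

    /* copy in fixed-size chunks */
    while ((nread = fread(buf, 1, sizeof(buf), in)) > 0)
    {
        if (fwrite(buf, 1, nread, out) != nread)
        {
            pg_log_error("could not write file \"%s\": %m", dst);
            exit(1);
        }
    }

    fclose(in);
    fclose(out);
}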

>
>
>> + pg_log_info("could not copy, retrying after %d seconds",
>> + sleeptime);
>> + pg_usleep(numretries++ * sleeptime * 1000000L);
>> + }
>> Retry functionality is handled only for copying of full files; should
>> we handle retry for copying of partial files as well?
>>
>> The log message and the sleep time do not match: you are multiplying
> sleeptime by numretries++ but logging only "sleeptime".
>
> Why are we retrying here? Capture the actual copy error and act
> accordingly. Blindly retrying does not make sense.
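
If the retry loop is kept at all, a small sketch of keeping the message and
the actual delay in sync (still a blind retry; this only illustrates the
logging fix):

    int         delay;

    while (numretries <= maxretries)
    {
        rc = system(copycmd);
        if (rc == 0)
            return;

        /* compute the delay once so the log message and pg_usleep() agree */
        delay = numretries * sleeptime;
        pg_log_info("could not copy, retrying after %d seconds", delay);
        pg_usleep((long) delay * 1000000L);
        numretries++;
    }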
>
> 3)
>> + maxretries = atoi(optarg);
>> + if (maxretries < 0)
>> + {
>> + pg_log_error("invalid value for maxretries");
>> + fprintf(stderr, _("%s: -r maxretries must be >= 0\n"), progname);
>> + exit(1);
>> + }
>> + break;
>> + case 's':
>> + sleeptime = atoi(optarg);
>> + if (sleeptime <= 0 || sleeptime > 60)
>> + {
>> + pg_log_error("invalid value for sleeptime");
>> + fprintf(stderr, _("%s: -s sleeptime must be between 1 and 60\n"),
>> progname);
>> + exit(1);
>> + }
>> + break;
>> We could enforce a range for maxretries, similar to sleeptime.
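
For example (MAX_RETRIES is a hypothetical constant, and I am assuming the
option letter is -r as in the existing error message; this just shows the
shape of the check):

    case 'r':
        maxretries = atoi(optarg);
        if (maxretries < 0 || maxretries > MAX_RETRIES)
        {
            pg_log_error("invalid value for maxretries");
            fprintf(stderr, _("%s: -r maxretries must be between 0 and %d\n"),
                    progname, MAX_RETRIES);
            exit(1);
        }
        break;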
>>
>> 4)
>> + fp = fopen(filename, "r");
>> + if (fp == NULL)
>> + {
>> + pg_log_error("could not read file \"%s\": %m", filename);
>> + exit(1);
>> + }
>> +
>> + labelfile = malloc(statbuf.st_size + 1);
>> + if (fread(labelfile, 1, statbuf.st_size, fp) != statbuf.st_size)
>> + {
>> + pg_log_error("corrupted file \"%s\": %m", filename);
>> + free(labelfile);
>> + exit(1);
>> + }
>> Should we check for malloc failure?
>>
>> Use pg_malloc instead of malloc.
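
pg_malloc() from common/fe_memutils.h reports the out-of-memory error and
exits for us, so no explicit NULL check is needed; a sketch of that part
(the trailing NUL terminator is my addition, since the buffer is allocated
with one extra byte):

    labelfile = pg_malloc(statbuf.st_size + 1);
    if (fread(labelfile, 1, statbuf.st_size, fp) != statbuf.st_size)
    {
        pg_log_error("corrupted file \"%s\": %m", filename);
        exit(1);
    }
    labelfile[statbuf.st_size] = '\0';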
>
>
>> 5) Should we add a display of progress, as the backup may take some time?
>> This can be added as an enhancement. We can get others' opinions on this.
>>
>> Yes, we should, but this is not the right time to do that.
>
>
>> 6)
>> + if (nIncrDir == MAX_INCR_BK_COUNT)
>> + {
>> + pg_log_error("too many incremental backups to combine");
>> + fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
>> progname);
>> + exit(1);
>> + }
>> +
>> + IncrDirs[nIncrDir] = optarg;
>> + nIncrDir++;
>> + break;
>>
>> If the backup count increases, providing the input may become difficult.
>> Should the user provide all the incremental backups from a parent folder,
>> and could we handle the ordering of the incremental backups internally?
>>
>> Why do we have that limit in the first place?
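
If we want to drop the limit entirely, the option handling could grow the
array on demand, roughly like this (maxIncrDir is a hypothetical counter,
and it assumes IncrDirs becomes a heap-allocated array grown with
pg_realloc() from fe_memutils.h):

    case 'i':
        if (nIncrDir == maxIncrDir)
        {
            maxIncrDir = (maxIncrDir > 0) ? maxIncrDir * 2 : 8;
            IncrDirs = pg_realloc(IncrDirs, maxIncrDir * sizeof(char *));
        }
        IncrDirs[nIncrDir++] = optarg;
        break;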
>
>
>> 7)
>> + if (isPartialFile)
>> + {
>> + if (verbose)
>> + pg_log_info("combining partial file \"%s.partial\"", fn);
>> +
>> + combine_partial_files(fn, IncrDirs, nIncrDir, subdirpath, outfn);
>> + }
>> + else
>> + copy_whole_file(infn, outfn);
>>
>> Add a verbose message for copying the whole file as well.
>>
>> 8) We could also check whether approximately enough disk space is
>> available before starting to combine the backups; this can be added as an
>> enhancement. We can get others' opinions on this.
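
On POSIX systems something like statvfs() could be used for that (Windows
would need GetDiskFreeSpaceEx() instead); a rough sketch, where outputdir
and estimated_size are hypothetical names for the target directory and the
estimated combined size:

#include <sys/statvfs.h>

    struct statvfs fs;

    if (statvfs(outputdir, &fs) == 0 &&
        (uint64) fs.f_bavail * fs.f_frsize < estimated_size)
    {
        pg_log_error("not enough free space in \"%s\"", outputdir);
        exit(1);
    }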
>>
>> 9)
>> + printf(_(" -i, --incr-backup=DIRECTORY incremental backup directory
>> (maximum %d)\n"), MAX_INCR_BK_COUNT);
>> + printf(_(" -o, --output-dir=DIRECTORY combine backup into
>> directory\n"));
>> + printf(_("\nGeneral options:\n"));
>> + printf(_(" -n, --no-clean do not clean up after
>> errors\n"));
>>
>> The help text "combine backup into directory" could be "combine backup
>> directory".
>>
>> 10)
>> +/* Max number of incremental backups to be combined. */
>> +#define MAX_INCR_BK_COUNT 10
>> +
>> +/* magic number in incremental backup's .partial file */
>>
>> MAX_INCR_BK_COUNT could be increased a little; some applications take one
>> full backup at the beginning of the month and an incremental backup on
>> each of the remaining days of the month.
>>
>> Regards,
>> Vignesh
>> EnterpriseDB: http://www.enterprisedb.com
>>
>>
>>
>
> --
> Ibrar Ahmed
>

--
Ibrar Ahmed
