Re: block-level incremental backup

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>
Cc: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: block-level incremental backup
Date: 2019-07-26 07:53:57
Message-ID: CALDaNm01DxcHwZ8f5N7gXv8iGer1jY+i-AuzkS4TxtmRowrLKQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 26, 2019 at 11:21 AM Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>
wrote:

> Hi Vignesh,
>
> Please find my comments inline below:
>
>> 1) If a relation file has changed due to truncate or vacuum,
>> the new files will be copied during incremental backup.
>> There is a chance that both the old file and the new file
>> will be present. I'm not sure whether cleanup of the
>> old file is handled.
>>
>
> When an incremental backup is taken, it either copies the file in its
> entirety if more than 90% of the file has changed, or writes a .partial
> file containing the changed-blocks bitmap and the actual data. For files
> that are unchanged, it still creates a .partial file, but with 0 bytes.
> This means there is a .partial file for every file that has to be looked
> up in the full backup.
> While composing a synthetic backup from an incremental backup, the
> pg_combinebackup tool only looks in the full (parent) backup for those
> relation files that have .partial files in the incremental backup. So, if
> a vacuum/truncate happened between the full and incremental backups, the
> incremental backup image will not have a 0-length .partial file for that
> relation, and so the synthetic backup restored using pg_combinebackup
> will not have that file either.
>
>
Thanks for the update, Jeevan. I feel this logic is good.
It will handle the case of deleting the old relation files.
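
To check my understanding of the above, here is a minimal Python sketch of
the logic as described in this mail. It is not the patch's code: the
function names, the in-memory dict layout, and the way a .partial entry is
represented (a mapping of block number to block data, empty for the 0-byte
case) are all assumptions made for illustration only.

```python
PARTIAL_SUFFIX = ".partial"
WHOLE_FILE_THRESHOLD = 0.9  # "changed more than 90%" from the description


def incremental_entry(name, blocks, changed):
    """Decide what the incremental backup stores for one relation file.

    blocks  -- list of block contents in the current file
    changed -- set of block numbers modified since the parent backup
    Returns (entry_name, payload). Hypothetical layout, for illustration.
    """
    if blocks and len(changed) / len(blocks) > WHOLE_FILE_THRESHOLD:
        return name, list(blocks)  # copy the file in its entirety
    # Otherwise a .partial entry; an empty mapping stands for the
    # 0-byte .partial written for an unchanged file.
    return name + PARTIAL_SUFFIX, {n: blocks[n] for n in sorted(changed)}


def combine_backup(full, incremental):
    """Compose a synthetic backup from a full backup and one incremental.

    Only files that appear in the incremental backup are produced, so a
    file that vanished before the incremental backup is simply not restored.
    """
    synthetic = {}
    for name, payload in incremental.items():
        if not name.endswith(PARTIAL_SUFFIX):
            synthetic[name] = list(payload)  # whole-file copy
            continue
        base = name[: -len(PARTIAL_SUFFIX)]
        blocks = list(full[base])  # start from the parent's copy
        for block_no, data in payload.items():
            if block_no >= len(blocks):  # file grew since the full backup
                blocks.extend([b""] * (block_no + 1 - len(blocks)))
            blocks[block_no] = data
        synthetic[base] = blocks
    return synthetic


full = {"16384": [b"a", b"b"], "16385": [b"x"], "16386": [b"old"]}
incr = {
    "16384.partial": {1: b"B"},  # one changed block
    "16385.partial": {},         # unchanged file, 0-byte .partial
    "16387": [b"new"],           # new file copied whole
}
restored = combine_backup(full, incr)
```

In this sketch "16386" has no entry in the incremental backup, so it is
absent from the restored result, which is the old-file cleanup case from
point 1.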

>
>
>> 2) Just a small thought on building the bitmap:
>> can the bitmap be built and maintained as and when
>> changes happen in the system?
>> If we build the bitmap while doing the incremental backup,
>> scanning through each file might take more time.
>> This can be a configurable parameter: the system can run
>> without capturing this information by default, but users
>> who take incremental backups frequently can enable this
>> configuration, which should track the modified blocks.
>
>
> IIUC, this will need changes in the backend. Honestly, I think backup is a
> maintenance task, and hampering the backend for this does not look like a
> good idea. Having said that, even if we have to provide this as a switch
> for some users, it will need a different infrastructure from what we are
> building here for constructing the bitmap, where we scan all the files one
> by one. Maybe for the initial version we can go with the current proposal
> that Robert has suggested, and add this switch at a later point as an
> enhancement.
>
>
That sounds fair to me.
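
For the record, the two ways of building the bitmap discussed in point 2
could be sketched roughly as below. This is only an illustration in Python:
`bitmap_by_scanning`, `ModifiedBlockTracker`, and the `is_changed` predicate
are hypothetical names, and the real criterion for deciding that a block has
changed is not specified in this mail.

```python
def bitmap_by_scanning(blocks, is_changed):
    """Backup-time approach: visit every block of a file and set one bit
    per changed block. Cost grows with file size on every backup."""
    bitmap = bytearray((len(blocks) + 7) // 8)
    for block_no, block in enumerate(blocks):
        if is_changed(block):  # hypothetical predicate, an assumption
            bitmap[block_no // 8] |= 1 << (block_no % 8)
    return bitmap


class ModifiedBlockTracker:
    """Write-time approach: keep a bitmap per relation file up to date as
    blocks are modified. Off by default, mirroring the suggested switch."""

    def __init__(self, enabled=False):
        self.enabled = enabled
        self.bitmaps = {}

    def note_write(self, relfile, block_no):
        if not self.enabled:
            return  # default: no tracking, no overhead on writes
        bitmap = self.bitmaps.setdefault(relfile, bytearray())
        byte_no = block_no // 8
        if byte_no >= len(bitmap):  # grow the bitmap on demand
            bitmap.extend(bytes(byte_no + 1 - len(bitmap)))
        bitmap[byte_no] |= 1 << (block_no % 8)


scanned = bitmap_by_scanning([1, 5, 2, 7], lambda block: block > 4)

tracker = ModifiedBlockTracker(enabled=True)
tracker.note_write("16384", 0)
tracker.note_write("16384", 9)
```

The second form moves the cost from backup time to every block write, which
is the backend overhead mentioned in the reply above.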

Regards,
vignesh
EnterpriseDB: http://www.enterprisedb.com
