Re: [RFC] Incremental backup v2: add backup profile to base backup

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Marco Nenciarini <marco(dot)nenciarini(at)2ndquadrant(dot)it>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Incremental backup v2: add backup profile to base backup
Date: 2014-10-03 15:53:48
Message-ID: 542EC68C.9090606@vmware.com
Lists: pgsql-hackers

On 10/03/2014 06:31 PM, Marco Nenciarini wrote:
> Hi Hackers,
>
> I've updated the wiki page
> https://wiki.postgresql.org/wiki/Incremental_backup following the result
> of discussion on hackers.
>
> Compared to the first version, we switched from a timestamp+checksum
> based approach to one based on LSN.
>
> This patch adds an option to pg_basebackup and to the replication
> protocol's BASE_BACKUP command to generate a backup_profile file. It is almost
> useless by itself, but it is the foundation on which we will build the
> file based incremental backup (and hopefully a block based incremental
> backup after it).

I'd suggest jumping straight to block-based incremental backup. It's not
significantly more complicated to implement, and if you implement both
separately, then we'll have to support both forever. If you really need
to, you can implement file-level diff as a special case, where the
server sends all blocks in the file, if any of them have an LSN > the
cutoff point. But I'm not sure there's any point in that once you have
block-level support.
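
To make the comparison concrete, here is a minimal sketch (not code from the patch) of block-level selection with file-level diff as a special case. It assumes the default 8 kB block size, a little-endian host, and that each page starts with its LSN stored as two 32-bit words (high, then low). The names page_lsn, changed_blocks, file_level_blocks and make_page are hypothetical, and never-initialized all-zero pages are not given any special treatment.

```python
import struct

BLCKSZ = 8192  # assumed: PostgreSQL's default block size


def page_lsn(page: bytes) -> int:
    """Return the LSN held in the first 8 bytes of a page header.

    Assumes a little-endian host; the LSN is two 32-bit words, high then low.
    """
    xlogid, xrecoff = struct.unpack_from("<II", page, 0)
    return (xlogid << 32) | xrecoff


def changed_blocks(data: bytes, cutoff_lsn: int):
    """Block-level diff: yield (block_number, page) for blocks with LSN > cutoff."""
    for blkno in range(len(data) // BLCKSZ):
        page = data[blkno * BLCKSZ:(blkno + 1) * BLCKSZ]
        if page_lsn(page) > cutoff_lsn:
            yield blkno, page


def file_level_blocks(data: bytes, cutoff_lsn: int):
    """File-level diff as a special case: all blocks if any block changed."""
    if next(changed_blocks(data, cutoff_lsn), None) is None:
        return []
    nblocks = len(data) // BLCKSZ
    return [(b, data[b * BLCKSZ:(b + 1) * BLCKSZ]) for b in range(nblocks)]


def make_page(lsn: int) -> bytes:
    """Build a dummy page carrying only an LSN (hypothetical test helper)."""
    header = struct.pack("<II", lsn >> 32, lsn & 0xFFFFFFFF)
    return header + bytes(BLCKSZ - len(header))
```

With three pages at LSNs 100, 300 and 200 and a cutoff of 150, the block-level path selects blocks 1 and 2, while the file-level path returns all three blocks.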

If we're going to need a profile file - and I'm not convinced of that -
is there any reason to not always include it in the backup?

> Any comment will be appreciated. In particular I'd appreciate comments
> on correctness of relnode files detection and LSN extraction code.

I didn't look at it in detail, but one future problem comes to mind:
Once you implement the server-side code that only sends a file if its
LSN is higher than the cutoff point that the client gave, you'll have to
scan the whole file first, to see if there are any blocks with a higher
LSN. At least until you find the first such block. So with a file-level
implementation of this sort, you'll have to read every file twice in the
worst case: once to check the LSNs, and again to actually send it.
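
The cost described above can be illustrated with a small counting sketch. It is an assumption-laden model, not the server code: a file is represented as a list of per-block LSNs, and the hypothetical functions first_block_above and blocks_read_file_level simply count how many blocks would be read.

```python
def first_block_above(lsns, cutoff_lsn):
    """Scan blocks in order; stop at the first one with LSN > cutoff.

    Returns (index_of_hit_or_None, number_of_blocks_read_during_scan).
    """
    for i, lsn in enumerate(lsns):
        if lsn > cutoff_lsn:
            return i, i + 1
    return None, len(lsns)


def blocks_read_file_level(lsns, cutoff_lsn):
    """Total block reads for a file-level decision followed by a full send."""
    hit, scanned = first_block_above(lsns, cutoff_lsn)
    sent = len(lsns) if hit is not None else 0  # the whole file is re-read to send it
    return scanned + sent


# Worst case: only the last block changed, so the file is read twice in full.
worst = blocks_read_file_level([10, 10, 10, 99], cutoff_lsn=50)
# Best case for a changed file: the first block is already above the cutoff.
best = blocks_read_file_level([99, 10, 10, 10], cutoff_lsn=50)
# Unchanged file: one full scan, nothing sent.
unchanged = blocks_read_file_level([10, 10, 10, 10], cutoff_lsn=50)
```

In this model a four-block file costs eight block reads in the worst case, five when the first block has changed, and four when nothing has changed.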

- Heikki
