Re: block-level incremental backup

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: block-level incremental backup
Date: 2019-09-16 08:31:21
Message-ID: CAA4eK1JeaiGQCSA4QsokRUmDxqBq9PRcC8CSem0GsvKQ9S5JFA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 16, 2019 at 7:22 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, Sep 12, 2019 at 9:13 AM Jeevan Chalke
> <jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> > I had a look over this issue and observed that when the new database is
> > created, the catalog files are copied as-is into the new directory
> > corresponding to a newly created database. And as they are just copied,
> > the LSN on those pages are not changed. Due to this incremental backup
> > thinks that its an existing file and thus do not copy the blocks from
> > these new files, leading to the failure.
>
> *facepalm*
>
> Well, this shoots a pretty big hole in my design for this feature. I
> don't know why I didn't think of this when I wrote out that design
> originally. Ugh.
>
> Unless we change the way that CREATE DATABASE and any similar
> operations work so that they always stamp pages with new LSNs, I think
> we have to give up on the idea of being able to take an incremental
> backup by just specifying an LSN.
>

This seems to be a blocking problem for the LSN-based design. Could we
use the file's creation time instead? Basically, if a file's creation
time is later than the "START TIME:" recorded in the backup_label,
then include that file in its entirety. I think one big point against
this is clock skew: what if somebody tinkers with the clock? Also,
this can cover cases like the one Jeevan has pointed out, but it might
not cover the other cases which we found problematic.
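
A minimal sketch of that check, in Python for brevity. Everything in it
is hypothetical (the function name, the dictionary of per-file creation
times, the sample paths); none of it comes from the patch. It also
assumes the platform reports a trustworthy creation time for each file,
which is exactly the assumption that clock changes would break.

```python
from datetime import datetime, timezone


def files_to_copy_whole(creation_times, backup_start):
    """Return the paths created after the previous backup started.

    creation_times: dict of path -> timezone-aware datetime (hypothetical input).
    backup_start:   the previous backup's "START TIME:" as a datetime.
    """
    return sorted(p for p, t in creation_times.items() if t > backup_start)


# Hypothetical example: one catalog file predates the full backup, the
# other was copied into place by a later CREATE DATABASE.
start = datetime(2019, 9, 12, 9, 0, tzinfo=timezone.utc)
times = {
    "base/1/1259": datetime(2019, 9, 1, 8, 0, tzinfo=timezone.utc),
    "base/16384/1259": datetime(2019, 9, 12, 9, 30, tzinfo=timezone.utc),
}
whole = files_to_copy_whole(times, start)
```

Here only the second file is selected for a full copy. A file whose
timestamp was recorded while the clock was set back would be missed.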

> We'll instead need to get a list of
> files from the server first, and then request the entirety of any that
> we don't have, plus the changed blocks from the ones that we do have.
> I guess that will make Stephen happy, since it's more like the design
> he wanted originally (and should generalize more simply to parallel
> backup).
>
> One question I have is: is there any scenario in which an existing
> page gets modified after the full backup and before the incremental
> backup but does not end up with an LSN that follows the full backup's
> start LSN?
>

I think the operations covered by WAL flag XLR_SPECIAL_REL_UPDATE will
have similar problems.

One related point: how do incremental backups handle the case where
vacuum partially truncates a relation? With the current patch/design,
it doesn't appear that such information can be passed via an
incremental backup. I am not sure if this is a problem, but it would
be good if we can somehow handle this.
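
To illustrate the truncation concern, here is a small Python sketch of
combining a full backup's copy of a relation file with the changed
blocks from an incremental backup. It is not how the patch works; the
function and the `current_nblocks` field are hypothetical. It only
shows that, unless the incremental backup also carries the relation's
current length, the combined file keeps the blocks that vacuum removed.

```python
def merge_blocks(full_blocks, changed_blocks, current_nblocks=None):
    """Overlay changed blocks onto the full backup's copy of one file.

    full_blocks:     list of block contents from the full backup.
    changed_blocks:  dict of block number -> new content.
    current_nblocks: relation length in blocks at incremental-backup
                     time, or None if that was not transmitted
                     (hypothetical extra field).
    """
    merged = list(full_blocks)
    for blkno, data in changed_blocks.items():
        if blkno >= len(merged):
            # The relation grew: pad up to the new block number.
            merged.extend([None] * (blkno + 1 - len(merged)))
        merged[blkno] = data
    if current_nblocks is not None:
        # Drop the tail that was truncated away after the full backup.
        del merged[current_nblocks:]
    return merged


full = ["a0", "a1", "a2", "a3"]
changed = {0: "b0"}  # vacuum also truncated the file to 2 blocks
without_length = merge_blocks(full, changed)
with_length = merge_blocks(full, changed, current_nblocks=2)
```

Without the length, the result still has four blocks, two of them
stale; with it, the result has the two blocks the relation really has.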

Won't the operations that directly call heap_sync at the end, without
writing WAL, have a similar problem as well? Similarly, it is not very
clear whether unlogged relations are handled in some way; if not, that
could be documented.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
