Re: Proposal: Incremental Backup

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Gabriele Bartolini <gabriele(dot)bartolini(at)2ndquadrant(dot)it>, desmodemone <desmodemone(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Marco Nenciarini <marco(dot)nenciarini(at)2ndquadrant(dot)it>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Incremental Backup
Date: 2014-08-07 11:11:23
Message-ID: CAHGQGwFM=DB_NujTgoUr3oSqLEmHKNtSCCWZo035o_-bbLUYcQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 7, 2014 at 12:20 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> On Wed, Aug 6, 2014 at 06:48:55AM +0100, Simon Riggs wrote:
>> On 6 August 2014 03:16, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> > On Wed, Aug 6, 2014 at 09:17:35AM +0900, Michael Paquier wrote:
>> >> On Wed, Aug 6, 2014 at 9:04 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> >> >
>> >> > On 5 August 2014 22:38, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>> >> > Thinking some more, it seems like this whole store-multiple-LSNs
>> >> > thing is too much. We can still do block-level incrementals just by
>> >> > using a single LSN as the reference point. We'd still need a complex
>> >> > file format and a complex file reconstruction program, so I think that
>> >> > is still "next release". We can call that INCREMENTAL BLOCK LEVEL.
>> >>
>> >> Yes, that's the approach taken by pg_rman for its block-level
>> >> incremental backup. Btw, I don't think that the CPU cost to scan all
>> >> the relation files added to the one to rebuild the backups is worth
>> >> doing it on large instances. File-level backup would cover most of the
>> >
>> > Well, if you scan the WAL files from the previous backup, that will tell
>> > you which pages need incremental backup.
>>
>> That would require you to store that WAL, which is something we hope
>> to avoid. Plus if you did store it, you'd need to retrieve it from
>> long term storage, which is what we hope to avoid.
>
> Well, for file-level backups we have:
>
> 1) use file modtime (possibly inaccurate)
> 2) use file modtime and checksums (heavy read load)
>
> For block-level backups we have:
>
> 3) accumulate block numbers as WAL is written
> 4) read previous WAL at incremental backup time
> 5) read data page LSNs (high read load)
>
> The question is which of these do we want to implement?

Some data has no LSN, for example postgresql.conf. If such data has been
modified since the last backup, does it also need to be included in the
incremental backup? Probably yes. So implementing only block-level backup
does not seem to be a complete solution; it needs file-level backup as
infrastructure for such data. This makes me think that it's more reasonable
to implement file-level backup first.
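
To make the file-level idea concrete, here is a rough sketch (in Python, not
PostgreSQL code) of how options 1) and 2) above could pick up files that carry
no LSN, such as postgresql.conf. The names snapshot() and changed_files() and
the manifest layout are hypothetical, invented for illustration only; hashing
every file is exactly the heavy read load mentioned for option 2).

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's relative path to (mtime_ns, size, sha256 hex digest).

    Hypothetical manifest format, for illustration only. Reading every file
    to hash it is the "heavy read load" of the checksum approach.
    """
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            st = path.stat()
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[str(path.relative_to(root))] = (
                st.st_mtime_ns,
                st.st_size,
                digest,
            )
    return manifest


def changed_files(previous, current, use_checksum=True):
    """Return sorted relative paths that are new or modified since `previous`.

    use_checksum=False compares only (mtime, size), which is cheap but
    possibly inaccurate; use_checksum=True compares the content digests.
    """
    changed = []
    for name, (mtime, size, digest) in current.items():
        old = previous.get(name)
        if old is None:
            changed.append(name)  # file did not exist at the last backup
        elif use_checksum:
            if old[2] != digest:
                changed.append(name)
        elif (old[0], old[1]) != (mtime, size):
            changed.append(name)
    return sorted(changed)
```

An incremental run would take a new snapshot(), call changed_files() against
the manifest saved by the previous backup, and copy only the returned files.
Files deleted since the last backup are not reported by this sketch and would
need separate handling.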

Regards,

--
Fujii Masao
