From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Marco Nenciarini <marco(dot)nenciarini(at)2ndquadrant(dot)it> |
Cc: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [RFC] Incremental backup v2: add backup profile to base backup |
Date: | 2014-10-06 15:50:07 |
Message-ID: | CA+TgmoYdG1JvymERkGozpfazJBHTNbxSAvWMHGmK7dRioP8bAQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Oct 6, 2014 at 11:33 AM, Marco Nenciarini
<marco(dot)nenciarini(at)2ndquadrant(dot)it> wrote:
>> 1. Take a full backup. Basically, we already have this. In the
>> backup label file, make sure to note the newest LSN guaranteed to be
>> present in the backup.
>
> Don't we already have it in "START WAL LOCATION"?
Yeah, probably. I was too lazy to go look for it, but that sounds
like the right thing.
>> 2. Take a differential backup. In the backup label file, note the LSN
>> of the full backup to which the differential backup is relative, and the
>> newest LSN guaranteed to be present in the differential backup. The
>> actual backup can consist of a series of 20-byte buffer tags, those
>> being the exact set of blocks newer than the base-backup's
>> latest-guaranteed-to-be-present LSN. Each buffer tag is followed by
>> an 8kB block of data. If a relfilenode is truncated or removed, you
>> need some way to indicate that in the backup; e.g. include a buffertag
>> with forknum = -(forknum + 1) and blocknum = the new number of blocks,
>> or InvalidBlockNumber if removed entirely.
>
> To have a working backup you need to ship each block which is newer than
> latest-guaranteed-to-be-present in full backup and not newer than
> latest-guaranteed-to-be-present in the current backup. Also, as a
> further optimization, you can think about not sending the empty space in
> the middle of each page.
Right. Or compressing the data.
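To make the block filter and the "empty space" optimization concrete, here is a rough Python sketch, not a proposed implementation. It assumes the standard page header layout (pd_lsn in the first 8 bytes, pd_lower and pd_upper at offsets 12 and 14), a little-endian host, and that the hole between pd_lower and pd_upper contains only zeros. The function names are made up for illustration.

```python
import struct

BLCKSZ = 8192            # default PostgreSQL block size
PAGE_HEADER_SIZE = 24    # size of the fixed page header

def page_lsn(page: bytes) -> int:
    """Return pd_lsn, stored as two 32-bit halves (assumed little-endian)."""
    xlogid, xrecoff = struct.unpack_from("<II", page, 0)
    return (xlogid << 32) | xrecoff

def needs_shipping(page: bytes, base_lsn: int, current_lsn: int) -> bool:
    """True if the page is newer than the base backup's LSN and not newer
    than the LSN recorded for the current backup."""
    return base_lsn < page_lsn(page) <= current_lsn

def strip_hole(page: bytes):
    """Drop the free space between pd_lower and pd_upper.

    Returns (hole_offset, hole_length, payload). If the header looks
    implausible (e.g. an all-zero page), the page is returned unchanged.
    """
    pd_lower, pd_upper = struct.unpack_from("<HH", page, 12)
    if not (PAGE_HEADER_SIZE <= pd_lower <= pd_upper <= len(page)):
        return 0, 0, page
    return pd_lower, pd_upper - pd_lower, page[:pd_lower] + page[pd_upper:]

def restore_hole(hole_offset: int, hole_length: int, payload: bytes) -> bytes:
    """Rebuild the full page by re-inserting a zero-filled hole."""
    return payload[:hole_offset] + bytes(hole_length) + payload[hole_offset:]
```

The stripped payload could additionally be run through a general-purpose compressor; the two ideas are independent.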
> My main concern here is about how postgres can remember that a
> relfilenode has been deleted, in order to send the appropriate "deletion
> tag".
You also need to handle truncation.
> IMHO the easiest way is to send the full list of files along with the backup
> and leave to the client the task of deleting unneeded files. The backup
> profile has this purpose.
>
> Moreover, I do not like the idea of using only a stream of blocks as the
> actual differential backup, for the following reasons:
>
> * AFAIK, with the current infrastructure, you cannot do a backup with a
> block stream only. To have a valid backup you need many files for which
> the concept of LSN doesn't apply.
>
> * I don't like to have all the data from the various
> tablespace/db/whatever all mixed in the same stream. I'd prefer to have
> the blocks saved on a per file basis.
OK, that makes sense. But you still only need the file list when
sending a differential backup, not when sending a full backup. So
maybe a differential backup looks like this:
- Ship a table-of-contents file with a list of the relation files currently
present and the length of each in blocks.
- For each block that's been modified since the original backup, ship
a file called delta_<original file name> which is of the form <block
number><changed block contents> [...].
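
As a rough illustration of the layout above, here is a Python sketch of how a client might encode a delta file and rebuild relation files from a full backup plus a differential one. The details are assumptions made for the example: a 4-byte little-endian block number, 8kB blocks, in-memory dicts standing in for files on disk, and hypothetical function names.

```python
import struct
from typing import Dict, Iterator, Tuple

BLCKSZ = 8192
_BLOCKNO = struct.Struct("<I")   # assumed encoding of <block number>

def encode_delta(changed: Dict[int, bytes]) -> bytes:
    """Serialize changed blocks as <block number><block contents> records."""
    out = bytearray()
    for blkno in sorted(changed):
        if len(changed[blkno]) != BLCKSZ:
            raise ValueError("block %d has wrong size" % blkno)
        out += _BLOCKNO.pack(blkno) + changed[blkno]
    return bytes(out)

def decode_delta(delta: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (block number, block contents) pairs from a delta file."""
    record = _BLOCKNO.size + BLCKSZ
    if len(delta) % record:
        raise ValueError("truncated delta file")
    for off in range(0, len(delta), record):
        (blkno,) = _BLOCKNO.unpack_from(delta, off)
        yield blkno, delta[off + _BLOCKNO.size:off + record]

def restore(base: Dict[str, bytes], toc: Dict[str, int],
            deltas: Dict[str, bytes]) -> Dict[str, bytes]:
    """Apply a differential backup on top of the files of a full backup.

    Files absent from the table of contents are dropped; files present are
    truncated or zero-extended to the listed length, then patched.
    """
    restored = {}
    for name, nblocks in toc.items():
        data = bytearray(base.get(name, b""))
        target = nblocks * BLCKSZ
        del data[target:]                        # handle truncation
        data.extend(bytes(target - len(data)))   # handle extension
        for blkno, block in decode_delta(deltas.get("delta_" + name, b"")):
            if blkno >= nblocks:
                raise ValueError("block beyond length listed in TOC")
            data[blkno * BLCKSZ:(blkno + 1) * BLCKSZ] = block
        restored[name] = bytes(data)
    return restored
```

With this shape, removal and truncation need no special marker in the block stream: the table of contents alone tells the client which files to delete and how long each remaining file should be.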
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company