Re: block-level incremental backup

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: block-level incremental backup
Date: 2019-04-10 19:57:38
Lists: pgsql-hackers

On 10.04.2019 19:51, Robert Haas wrote:
> On Wed, Apr 10, 2019 at 10:22 AM Konstantin Knizhnik
> <k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
>> Some time ago I implemented an alternative version of the ptrack utility
>> (not the one used in pg_probackup)
>> which detects updated blocks at the file level. It is very simple and maybe
>> it can someday be integrated into master.
> I don't think this is completely crash-safe. It looks like it
> arranges to msync() the ptrack file at appropriate times (although I
> haven't exhaustively verified the logic), but it uses MS_ASYNC, so
> it's possible that the ptrack file could get updated on disk either
> before or after the relation file itself. I think before is probably
> OK -- it just risks having some blocks look modified when they aren't
> really -- but after seems like it is very much not OK. And changing
> this to use MS_SYNC would probably be really expensive. Likely a
> better approach would be to hook into the new fsync queue machinery
> that Thomas Munro added to PostgreSQL 12.

I do not think that MS_SYNC or the fsync queue is needed here.
If a power failure or OS crash causes the loss of some writes to the ptrack map,
then Postgres will in any case perform recovery, and updating pages from
WAL will mark them in the ptrack map once again. So, as with CLOG
and many other Postgres files, it is not critical to lose some writes,
because they will be restored from WAL. And before truncating WAL,
Postgres performs a checkpoint, which flushes all changes to disk,
including ptrack map updates.
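
The recovery argument above can be illustrated with a toy model. This is only a sketch, not PostgreSQL code: the class `ToyServer` and all of its fields are hypothetical, WAL is modelled as a durable list, and the ptrack map as a dictionary whose unsynced writes vanish on a crash.

```python
# Toy model (NOT PostgreSQL code): shows why losing unsynced ptrack map
# writes is harmless as long as WAL replay marks the blocks again.

class ToyServer:
    def __init__(self):
        self.next_lsn = 0
        self.wal = []        # durable (lsn, block) records since last checkpoint
        self.map_mem = {}    # ptrack map in memory: block -> last LSN
        self.map_disk = {}   # the part of the map that has reached disk

    def update(self, block):
        self.next_lsn += 1
        self.wal.append((self.next_lsn, block))   # WAL record is durable
        self.map_mem[block] = self.next_lsn       # map write may still be lost

    def checkpoint(self):
        # Assumption: a checkpoint flushes the map together with the data,
        # and only then is older WAL truncated.
        self.map_disk = dict(self.map_mem)
        self.wal = []

    def crash_and_recover(self):
        self.map_mem = dict(self.map_disk)        # unsynced map writes are gone
        for lsn, block in self.wal:               # redo marks the blocks again
            self.map_mem[block] = lsn


def demo():
    s = ToyServer()
    s.update(1)
    s.update(2)
    s.checkpoint()
    s.update(3)
    s.update(4)            # these two map writes never reach disk
    s.crash_and_recover()
    return s.map_mem
```

In this model, blocks 3 and 4 are missing from the on-disk map at the moment of the crash, but replaying the WAL written since the checkpoint marks them again.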

> It looks like your system maps all the blocks in the system into a
> fixed-size map using hashing. If the number of modified blocks
> between the full backup and the incremental backup is large compared
> to the size of the ptrack map, you'll start to get a lot of
> false-positives. It will look as if much of the database needs to be
> backed up. For example, in your sample configuration, you have
> ptrack_map_size = 1000003. If you've got a 100GB database with 20%
> daily turnover, that's about 2.6 million blocks. If you bump a
> random entry ~2.6 million times in a map with 1000003 entries, on the
> average ~92% of the entries end up getting bumped, so you will get
> very little benefit from incremental backup. This problem drops off
> pretty fast if you raise the size of the map, but it's pretty critical
> that your map is large enough for the database you've got, or you may
> as well not bother.
This is why the ptrack block size should be larger than the page size.
Assume that it is 1MB. 1MB is considered to be the optimal size of a disk
I/O, at which frequent seeks do not degrade read speed (this is most
critical for HDDs). In other words, reading 10 random pages (20%) from
this 1MB block will take almost the same amount of time as (or even
longer than) reading the whole 1MB in one operation.
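
As a rough back-of-the-envelope check of this claim, the sketch below compares the two access patterns under a simple cost model. The per-request positioning cost (8 ms) and the transfer rate (150 MB/s) are illustrative assumptions for an HDD, not measurements, and the function `read_time_ms` is hypothetical.

```python
PAGE = 8 * 1024        # Postgres page size in bytes
CHUNK = 1024 * 1024    # assumed ptrack block size (1MB)

def read_time_ms(n_requests, bytes_per_request, seek_ms=8.0, mb_per_s=150.0):
    """Estimated time for n_requests reads, each paying one positioning cost.

    seek_ms and mb_per_s are assumed, illustrative HDD figures.
    """
    total_bytes = n_requests * bytes_per_request
    transfer_ms = total_bytes / (mb_per_s * 1024 * 1024) * 1000.0
    return n_requests * seek_ms + transfer_ms

random_pages = read_time_ms(10, PAGE)   # 10 separate 8KB reads
whole_chunk = read_time_ms(1, CHUNK)    # one 1MB sequential read
```

Under these assumptions the positioning cost dominates the ten small reads, so reading the whole 1MB chunk is not slower. With much cheaper positioning (for example on SSDs) the comparison changes, which is why both values are parameters.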

There will be just 100000 used entries in the ptrack map, with a very small
probability of collision.
Actually, I chose this size (1000003) for the ptrack map because, with a
1MB block size, it allows mapping a 1TB database without a noticeable number
of collisions, which seems to be enough for most Postgres installations.
But increasing the ptrack map size 10 or even 100 times should also not
cause problems with modern RAM sizes.
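
The figures in this exchange can be reproduced with the standard occupancy estimate for uniform hashing: if n keys are hashed into m slots, the expected fraction of slots hit is 1 - (1 - 1/m)^n. The sketch below assumes a perfectly uniform hash and uses a hypothetical helper `expected_fill`; it says nothing about the hash function used in the patch.

```python
import math

GB = 1024 ** 3
MAP_SIZE = 1000003     # ptrack_map_size from the sample configuration

def expected_fill(map_size, n_keys):
    """Expected fraction of slots hit by n_keys uniformly hashed keys."""
    # Equivalent to 1 - (1 - 1/map_size) ** n_keys, computed stably.
    return -math.expm1(n_keys * math.log1p(-1.0 / map_size))

# Page-level tracking: 20% of a 100GB database in 8KB pages.
page_keys = int(0.2 * 100 * GB) // (8 * 1024)
page_fill = expected_fill(MAP_SIZE, page_keys)

# 1MB-level tracking: at most one key per 1MB block of the same database.
chunk_keys = (100 * GB) // (1024 * 1024)
chunk_fill = expected_fill(MAP_SIZE, chunk_keys)
```

With page-level keys most of the map ends up marked (a bit over 92%), while with 1MB keys the same 100GB database touches roughly a tenth of the map even if every block is modified. The same function can be evaluated for other database and map sizes.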

> It also appears that your system can't really handle resizing of the
> map in any friendly way. So if your data size grows, you may be faced
> with either letting the map become progressively less effective, or
> throwing it out and losing all the data you have.
> None of that is to say that what you're presenting here has no value,
> but I think it's possible to do better (and I think we should try).
I certainly did not consider the proposed patch a perfect solution;
it definitely requires improvements (and maybe a complete redesign).
I just wanted to present this approach (maintaining a hash of blocks' LSNs in
mapped memory) and keeping track of modified blocks at the file level
(unlike the current ptrack implementation, which logs changes in all places
in the Postgres code where data is updated).
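
A minimal sketch of the general idea (a fixed-size hash map from block to last-modification LSN) might look as follows. It is not the code from the patch: the class `LsnMap`, the key layout, and the use of CRC32 as the hash are assumptions made for illustration, and a plain Python list stands in for the memory-mapped file.

```python
import zlib

class LsnMap:
    """Fixed-size map: hash(file, chunk) -> highest LSN that touched it."""

    def __init__(self, size=1000003, chunk_bytes=1024 * 1024, page_bytes=8192):
        self.slots = [0] * size                      # stand-in for mapped memory
        self.pages_per_chunk = chunk_bytes // page_bytes

    def _slot(self, relfile, page_no):
        chunk_no = page_no // self.pages_per_chunk   # 1MB chunk holding the page
        key = f"{relfile}:{chunk_no}".encode()
        return zlib.crc32(key) % len(self.slots)     # assumed hash function

    def mark(self, relfile, page_no, lsn):
        """Called when a page is written at the file level."""
        i = self._slot(relfile, page_no)
        if lsn > self.slots[i]:
            self.slots[i] = lsn

    def changed_since(self, relfile, page_no, backup_lsn):
        """True if the page's chunk may have changed after backup_lsn.

        Hash collisions can only produce false positives, never false negatives.
        """
        return self.slots[self._slot(relfile, page_no)] > backup_lsn
```

An incremental backup would then copy every chunk for which `changed_since` returns true relative to the start LSN of the previous backup. Growing the map changes every slot index, which is the resizing problem raised in the quoted text.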

Also, despite the fact that this patch may be considered a raw
prototype, I have spent some time thinking about all aspects of this
approach, including fault tolerance and false positives.
