Re: block-level incremental backup

From: Ashwin Agrawal <aagrawal(at)pivotal(dot)io>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: block-level incremental backup
Date: 2019-04-10 16:56:42
Message-ID: CALfoeitO-vkfjubMFQRmgyXghL-uUnZLNxbr=obrQQsm8kFO4A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 10, 2019 at 9:21 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> I have a related idea, though. Suppose that, as Peter says upthread,
> you have a replication slot that prevents old WAL from being removed.
> You also have a background worker that is connected to that slot. It
> decodes WAL and produces summary files containing all block-references
> extracted from those WAL records and the associated LSN (or maybe some
> approximation of the LSN instead of the exact value, to allow for
> compression and combining of nearby references). Then you hold onto
> those summary files after the actual WAL is removed. Now, when
> somebody asks the server for all blocks changed since a certain LSN,
> it can use those summary files to figure out which blocks to send
> without having to read all the pages in the database. Although I
> believe that a simple system that finds modified blocks by reading
> them all is good enough for a first version of this feature and useful
> in its own right, a more efficient system will be a lot more useful,
> and something like this seems to me to be probably the best way to
> implement it.
>

Not to fork the conversation from incremental backups, but a similar approach
is what we have been thinking about for pg_rewind. Currently, pg_rewind
requires all the WAL to be present on the source side, from the point of
divergence onward, in order to rewind. Instead, we could just parse the WAL
and keep track of the changed blocks on the source. Then we don't need to
retain the WAL but can still rewind using the changed block map. So rewind
becomes very similar to the incremental backup proposed here, after
performing the rewind activity using target-side WAL only.
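
To make the changed block map idea a bit more concrete, here is a rough
sketch in Python. It is only an illustration under assumptions of mine:
the BlockRef record, the relation path strings, and the function names are
all hypothetical and do not correspond to anything in PostgreSQL or
pg_rewind. It collapses block references taken from WAL into a map of
block to latest LSN, and then answers "which blocks changed since LSN X".

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class BlockRef(NamedTuple):
    """One block reference extracted from a WAL record (hypothetical shape)."""
    relfile: str  # relation file path, e.g. "base/1/16384" (illustrative only)
    block: int    # block number within that file
    lsn: int      # LSN of the WAL record that touched the block


BlockKey = Tuple[str, int]


def summarize(records: Iterable[BlockRef]) -> Dict[BlockKey, int]:
    """Collapse references into a map: block -> highest LSN that touched it."""
    latest: Dict[BlockKey, int] = {}
    for ref in records:
        key = (ref.relfile, ref.block)
        if ref.lsn > latest.get(key, -1):
            latest[key] = ref.lsn
    return latest


def blocks_changed_since(summary: Dict[BlockKey, int], lsn: int) -> Set[BlockKey]:
    """Return blocks whose last recorded change is strictly after `lsn`."""
    return {key for key, last in summary.items() if last > lsn}


def blocks_to_copy_for_rewind(
    source_summary: Dict[BlockKey, int],
    target_summary: Dict[BlockKey, int],
    divergence_lsn: int,
) -> Set[BlockKey]:
    """Assumed rewind rule: copy every block changed on either side
    after the point of divergence."""
    return (blocks_changed_since(source_summary, divergence_lsn)
            | blocks_changed_since(target_summary, divergence_lsn))
```

Once the map exists, the WAL it was built from is no longer needed to
answer these questions, which is the property both the incremental backup
and the rewind case would rely on.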
