Re: finding changed blocks using WAL scanning

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: finding changed blocks using WAL scanning
Date: 2019-04-11 13:27:19
Message-ID: CA+TgmoZH_eRokEq9Dc2GgWh6UvWGf5rGYNMQTwJutVMehDjhNg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 11, 2019 at 3:52 AM Peter Eisentraut
<peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> I had in mind that you could have different overlapping incremental
> backup jobs in existence at the same time. Maybe a daily one to a
> nearby disk and a weekly one to a faraway cloud. Each one of these
> would need a separate replication slot, so that the information that is
> required for *that* incremental backup series is preserved between runs.
> So just one reserved replication slot that feeds the block summaries
> wouldn't work. Perhaps what would work is a flag on the replication
> slot itself "keep block summaries for this slot". Then when all the
> slots with the block summary flag are past an LSN, you can clean up the
> summaries before that LSN.

I don't think that quite works. There are two different LSNs. One is
the LSN of the oldest WAL archive that we need to keep around so that
it can be summarized, and the other is the LSN of the oldest summary
we need to keep around so it can be used for incremental backup
purposes. You can't keep both of those LSNs in the same slot.
Furthermore, the LSN stored in the slot is defined as the point back to
which we need to keep WAL, not the point back to which we need to keep
something else (summaries). Reusing that same field to mean something
different sounds inadvisable.

In other words, I think there are two problems which we need to
clearly separate: one is retaining WAL so we can generate summaries,
and the other is retaining summaries so we can generate incremental
backups. Even if we solve the second problem by using some kind of
replication slot, we still need to solve the first problem somehow.
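
To make the distinction concrete, here is a rough sketch of the two
retention horizons as separate values. Nothing here is PostgreSQL code:
the names RetentionState, summarized_upto, and backup_series_lsn are
hypothetical, and LSNs are modeled as plain integers. It only illustrates
why a single LSN field cannot carry both meanings.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RetentionState:
    # Hypothetical: LSN up to which WAL has already been summarized.
    summarized_upto: int
    # Hypothetical: for each incremental backup series, the LSN of its
    # most recent backup (the point its next run will start from).
    backup_series_lsn: Dict[str, int] = field(default_factory=dict)


def oldest_wal_needed(state: RetentionState) -> int:
    """Problem 1: WAL from this LSN onward must be kept so that it can
    still be summarized."""
    return state.summarized_upto


def oldest_summary_needed(state: RetentionState) -> Optional[int]:
    """Problem 2: summaries from this LSN onward must be kept so that
    every backup series can still take its next incremental backup.
    Returns None when no backup series exists."""
    if not state.backup_series_lsn:
        return None
    return min(state.backup_series_lsn.values())


if __name__ == "__main__":
    # A daily and a weekly series, as in the quoted scenario.
    state = RetentionState(
        summarized_upto=0x5000,
        backup_series_lsn={"daily": 0x4000, "weekly": 0x1000},
    )
    print(hex(oldest_wal_needed(state)))      # 0x5000
    print(hex(oldest_summary_needed(state)))  # 0x1000
```

In this sketch the two horizons move independently: the first advances
as summarization progresses, the second only when the slowest backup
series completes a run.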

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
