Re: How safe is pg_basebackup + continuous archiving?

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Kaixi Luo <kaixiluo(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-general(at)postgresql(dot)org>
Subject: Re: How safe is pg_basebackup + continuous archiving?
Date: 2016-06-30 00:17:46
Message-ID: CAB7nPqS8qdD+aV7KLKVKm2rpDWQvJYNYm4B+q0ebzAfF7LXSTA@mail.gmail.com
Lists: pgsql-general

On Wed, Jun 29, 2016 at 11:51 PM, Kaixi Luo <kaixiluo(at)gmail(dot)com> wrote:
> We use PostgreSQL at work and we do daily backups with pg_dump. After that
> we pg_restore the dump and check the database that there isn't any data
> corruption. As the database grows, the whole pg_dump / pg_restore cycle time
> is quickly approaching 24h, so we need to change strategies.

That's a mature solution. Without a doubt, many production systems
use it, abuse it, and rely on it.

> We've thought about using pg_basebackup + continuous archiving as an
> alternative backup process, but I have doubts regarding the safety of such
> procedure. As far as I know, pg_basebackup is done via rsync (and we also
> archive wals using rsync), so if by any chance disk corruption occurs on
> the master server, the corruption would be carried over to our backup
> server.

pg_basebackup does not use rsync. It speaks the replication protocol
and uses it to receive a base backup from the server in the shape of a
tar stream. It then either untars the content or writes it to disk
as-is. Note though that the contents of a backup are not fsync'd to
disk after pg_basebackup finishes (that's in the works), so you had
better do it yourself to ensure that the data stays there.
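
As a rough illustration of doing that fsync yourself, here is a minimal
Python sketch that walks a finished backup directory and flushes every
file and directory to disk. The function name fsync_tree and the
example path are made up for this example, and it assumes a POSIX
system where directories can be opened and fsync'd:

```python
import os


def fsync_tree(root):
    """Flush every regular file and directory under root to stable storage.

    Returns the number of entries that were fsync'd. Assumes a POSIX
    system where a directory can be opened read-only and fsync'd.
    """
    synced = 0
    # Walk bottom-up so files are flushed before the directory holding them.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are flushed.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            fd = os.open(path, os.O_RDONLY)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
            synced += 1
        # Flush the directory itself so its entries are durable too.
        dir_fd = os.open(dirpath, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
        synced += 1
    return synced


if __name__ == "__main__":
    import sys
    # Hypothetical usage: python fsync_tree.py /path/to/backup
    print(fsync_tree(sys.argv[1]))
```

This does not flush the parent directory of the backup root, so a
freshly created backup directory may need that extra step, or a plain
"sync" after the backup finishes.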

> How can we check for backup corruption in this case? Thanks you very much.

Before replaying a backup on a production system, you would need a
pre-production setup where the backup is replayed and checked.
Honestly, you can only be sure that a backup works correctly after
restoring it. You could always do some validation of the raw backup
contents, but in the end you need the WAL applied on top of it to be
able to check the state of a server that has reached a consistent
point.
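
For the "validation of the raw backup contents" part, one cheap check
on a tar-format backup is to read every member of the archive end to
end, which should at least catch a truncated or unreadable file. The
sketch below uses Python's standard tarfile module; the function name
check_tar_readable and the example path are made up for this example.
It says nothing about whether the data inside is logically correct,
which is why the restore plus WAL replay is still needed:

```python
import tarfile


def check_tar_readable(path, chunk_size=1 << 20):
    """Read every member of a tar archive in full.

    Returns the number of members seen. An unreadable or truncated
    archive is expected to surface as an exception from tarfile.
    This only checks that the archive can be read, not that the
    database files inside it are consistent.
    """
    count = 0
    with tarfile.open(path, "r:*") as tar:
        for member in tar:
            if member.isfile():
                fileobj = tar.extractfile(member)
                # Pull the data through in chunks and discard it.
                while fileobj.read(chunk_size):
                    pass
            count += 1
    return count


if __name__ == "__main__":
    import sys
    # Hypothetical usage: python check_tar.py /path/to/backup/base.tar
    print(check_tar_readable(sys.argv[1]))
```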
--
Michael
