Re: How safe is pg_basebackup + continuous archiving?

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Kaixi Luo <kaixiluo(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: How safe is pg_basebackup + continuous archiving?
Date: 2016-06-30 13:30:33
Message-ID: 20160630133033.GS21416@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Greetings,

* Kaixi Luo (kaixiluo(at)gmail(dot)com) wrote:
> We use PostgreSQL at work and we do daily backups with pg_dump. After that
> we pg_restore the dump and check the database that there isn't any data
> corruption. As the database grows, the whole pg_dump / pg_restore cycle
> time is quickly approaching 24h, so we need to change strategies.

I've found this to be a good strategy also, but it's far from perfect.
Corruption can still occur, for example, in indexes on the primary
system. Generally speaking, pg_dump doesn't exercise indexes and
therefore you won't notice if an index is corrupt.

> We've thought about using pg_basebackup + continuous archiving as an
> alternative backup process, but I have doubts regarding the safety of such
> procedure. As far as I know, pg_basebackup is done via rsync (and we also
> archive wals using rsync), so if by any chance disk corruption occurs on
> the master server, the corruption would be carried over to our backup
> server.

This is correct, but checksums are now available in modern versions of
PG, which will detect disk corruption. Those checksums would be carried
over to the backup server and could be verified there by using pg_dump
(note that this still wouldn't help with indexes, but you don't have
coverage there today anyway).

> How can we check for backup corruption in this case? Thanks you very much.

There has been some discussion about a specific tool for checking the
checksums throughout the entire system. I don't know of anyone activly
working on that, unfortunately.

There are a number of tools available to help with online backups and
continuous archiving beyond pgbasebackup and having to hand-roll
scripts. I'm personally biased towards and prefer pgBackRest, as I
helped start that project, but there are other tools, such as barman and
WAL-E, which would still be better than trying to implement everything
correctly on your own.

Thanks!

Stephen

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Scott Marlowe 2016-06-30 14:03:44 Re: Replication with non-read-only standby.
Previous Message Mark Morgan Lloyd 2016-06-30 13:18:13 Re: Stored procedure version control