Docs, backups, and MS VSS

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Docs, backups, and MS VSS
Date: 2016-07-02 14:31:32
Message-ID: CAMsr+YH8o1=f42ofoVYKQVtkmHeH5O16eS76fiXKMGZONX+O4g@mail.gmail.com
Lists: pgsql-hackers

Hi all

I just noticed that the Pg docs on backups don't discuss what kinds of
snapshots are safe to use without calling pg_start_backup() and
pg_stop_backup() and then copying the extra WAL.

I'd like to remedy that. My understanding is that it's safe to use a
filesystem or block device level snapshot without a pg_start_backup() and
pg_stop_backup() if:

1. The snapshot includes the entire PostgreSQL data directory including all
tablespaces and pg_xlog, i.e. everything is on one filesystem or block
device;

2. The snapshot mechanism guarantees an atomic snapshot, such that every
part of the filesystem or block device is snapshotted consistently at the
same effective moment in time.

This allows PostgreSQL to treat recovery from a snapshot just like recovery
from a crash or hard reset.
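
As a rough illustration of condition 1, here is a minimal Python sketch that
checks whether the data directory, pg_xlog and every tablespace sit on the same
device. The helper names (snapshot_paths, same_filesystem) are made up for this
example, and it assumes tablespaces appear as symlinks under pg_tblspc. Matching
device IDs say nothing about condition 2, and some filesystems (BTRFS
subvolumes, for instance) report different device IDs within one filesystem, so
treat the result as a hint only.

```python
import os


def snapshot_paths(data_dir):
    """Collect the data directory, pg_xlog, and every tablespace location."""
    paths = [data_dir, os.path.join(data_dir, "pg_xlog")]
    tblspc = os.path.join(data_dir, "pg_tblspc")
    if os.path.isdir(tblspc):
        for name in sorted(os.listdir(tblspc)):
            # Assumption: each entry is a symlink to the real tablespace directory
            paths.append(os.path.realpath(os.path.join(tblspc, name)))
    return paths


def same_filesystem(paths):
    """Return True if every path reports the same device ID (st_dev)."""
    # os.stat() follows symlinks, so a relocated pg_xlog is resolved first
    devices = {os.stat(p).st_dev for p in paths}
    return len(devices) == 1
```

Calling same_filesystem(snapshot_paths(data_dir)) with the path to the data
directory returns False as soon as pg_xlog or any tablespace lives on another
device.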

I'd like to document these conditions, and note that:

- Microsoft VSS is NOT safe, as it fails point 2. It is atomic only on a
per-file level. You MUST use pg_start_backup() and pg_stop_backup() with
WAL archiving or automated copy of the extra WAL if you use MS VSS. Most
Windows backup products use MS VSS internally. You must ensure they have
dedicated PostgreSQL backup support, using pg_basebackup,
pg_dump/pg_restore, or pg_start_backup()/pg_stop_backup().

- LVM is safe

- BTRFS should be fine

- Most SAN snapshots are fine, but verify with your vendor
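
For the non-atomic case (MS VSS above), the required sequence can be sketched
in Python roughly as follows. run_sql and take_copy are hypothetical callables
supplied by the caller (one executes a SQL statement, the other performs the
file-level copy), and the label is arbitrary; this only shows the ordering and
is not a complete backup tool.

```python
def backup_with_wal(run_sql, take_copy, label="file_level_backup"):
    """Bracket a non-atomic copy with pg_start_backup()/pg_stop_backup().

    run_sql   -- hypothetical callable: run_sql(statement, params)
    take_copy -- hypothetical callable that copies or snapshots the files
    """
    run_sql("SELECT pg_start_backup(%s)", (label,))
    try:
        take_copy()
    finally:
        # Always leave backup mode, even if the copy fails part-way
        run_sql("SELECT pg_stop_backup()", ())
    # The WAL written between the two calls still has to be archived or
    # copied alongside the backup; that step is outside this sketch.
```

The try/finally is there so that a failed copy does not leave the server in
backup mode.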

I suspect, but cannot prove, that it is also safe to snapshot pg_xlog on a
separate filesystem if and only if you take the datadir snapshot before the
pg_xlog snapshot and you have wal_keep_segments high enough to ensure that
WAL needed by the redo checkpoint in the datadir snapshot is not removed.
I wouldn't want to do this, and certainly not document it, since it's way
saner to use pg_start_backup() etc.

Reasonable? Will write the SGML if there's broad agreement here that it's
desirable.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
