PG 9.0 EBS Snapshot Backups on Slave

From: Andrew Hannon <ahannon(at)fiksu(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: PG 9.0 EBS Snapshot Backups on Slave
Date: 2012-01-24 00:54:16
Message-ID: D24D6FF8-92A4-46F3-B1EE-162168090E9B@fiksu.com
Lists: pgsql-general

Hello,

I am experimenting with a script that takes physical backups by snapshotting the EBS volumes behind a software RAID array. My basic workflow is:

1. Stop PG on the slave
2. pg_start_backup on the master
3. On the slave:
   A. Unmount the PG RAID
   B. Snapshot each disk in the RAID
   C. Mount the PG RAID
4. pg_stop_backup on the master
5. Restart PG on the slave
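
For concreteness, the workflow above could be sketched as an ordered plan of commands. This is only an illustration, not my actual script: the mount point, data directory, backup label, volume IDs, and the use of the `aws ec2 create-snapshot` CLI are all assumptions, and `mount <mount point>` presumes an fstab entry for the array.

```python
from typing import List, Tuple

# (host role, shell command) pairs; how they get executed (ssh, locally) is left open.
Step = Tuple[str, str]


def build_snapshot_plan(
    volume_ids: List[str],
    mount_point: str = "/mnt/pgraid",     # hypothetical mount point of the RAID
    data_dir: str = "/mnt/pgraid/data",   # hypothetical PGDATA on that RAID
    label: str = "ebs_snapshot",          # hypothetical backup label
) -> List[Step]:
    """Return the backup steps in the order described in the workflow."""
    plan: List[Step] = [
        # 1. Stop PG on the slave
        ("slave", f"pg_ctl stop -D {data_dir} -m fast"),
        # 2. pg_start_backup on the master
        ("master", f"psql -c \"SELECT pg_start_backup('{label}')\""),
        # 3A. Unmount the PG RAID
        ("slave", f"umount {mount_point}"),
    ]
    # 3B. Snapshot each disk in the RAID
    plan += [
        ("slave", f"aws ec2 create-snapshot --volume-id {vol} --description {label}")
        for vol in volume_ids
    ]
    plan += [
        # 3C. Mount the PG RAID again
        ("slave", f"mount {mount_point}"),
        # 4. pg_stop_backup on the master
        ("master", 'psql -c "SELECT pg_stop_backup()"'),
        # 5. Restart PG on the slave
        ("slave", f"pg_ctl start -D {data_dir}"),
    ]
    return plan


if __name__ == "__main__":
    for host, command in build_snapshot_plan(["vol-aaaa", "vol-bbbb"]):
        print(f"[{host}] {command}")
```

Building the plan as data keeps the ordering easy to review before anything is run against a real master or slave.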

Step 3 is actually quite fast. However, on the master I end up seeing the following warning:

WARNING: transaction log file "00000001000000CC00000076" could not be archived: too many failures

I am guessing (I will confirm with timestamps later) that this warning happens during steps 3A-3C. However, my questions below stand regardless of when the failure occurs.

It is worth noting that the slave (seemingly) catches up eventually, recovering the later log files, with streaming replication up to date. Can I trust this state?

Should I be concerned about this warning? Is it a simple blip that can safely be ignored, or have I lost data? From googling, it looks like the number of retry attempts is not configurable (it appears to have retried a handful of times).

If this is indeed a real problem, am I best off changing my archive_command to retain logs in a transient location when I am in "snapshot mode", and then ship them in bulk once the snapshot has completed? Are there any other remedies that I am missing?
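
To make the "transient location" idea concrete, here is a rough sketch of what such an archive helper might look like. Everything in it is hypothetical: the flag file that marks "snapshot mode", the staging directory, and the use of a plain directory as the archive destination (in practice the destination would more likely be reached via rsync or scp to the slave).

```python
import shutil
from pathlib import Path
from typing import List


def archive_wal(wal_path, wal_name, dest_dir, staging_dir, flag_file) -> Path:
    """Copy one WAL segment and return the path it was written to.

    While flag_file exists ("snapshot mode"), the segment goes to
    staging_dir; otherwise it goes straight to dest_dir.
    """
    target_dir = Path(staging_dir) if Path(flag_file).exists() else Path(dest_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / wal_name
    if target.exists():
        # Refuse to overwrite an already-archived segment.
        raise FileExistsError(f"{target} already archived")
    # Copy under a temporary name, then rename, so a half-written
    # file never appears under the final segment name.
    tmp = target_dir / (wal_name + ".tmp")
    shutil.copy2(str(wal_path), str(tmp))
    tmp.rename(target)
    return target


def flush_staging(staging_dir, dest_dir) -> List[str]:
    """Ship staged segments to dest_dir in name order; return their names."""
    staging, dest = Path(staging_dir), Path(dest_dir)
    if not staging.is_dir():
        return []
    dest.mkdir(parents=True, exist_ok=True)
    shipped = []
    for seg in sorted(staging.iterdir()):
        if seg.name.endswith(".tmp"):
            continue  # skip copies that never completed
        shutil.move(str(seg), str(dest / seg.name))
        shipped.append(seg.name)
    return shipped
```

A small wrapper script would call `archive_wal` from archive_command with `%p` and `%f` as the first two arguments, and the snapshot script would call `flush_staging` after step 3C, once the flag file has been removed.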

Thank you very much for your time,

Andrew Hannon
