Enforcing that all WAL has been replayed after restoring from backup

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Enforcing that all WAL has been replayed after restoring from backup
Date: 2011-08-09 09:00:00
Message-ID: 4E40F710.6000404@enterprisedb.com
Lists: pgsql-hackers

Currently, if you take a backup with "pg_basebackup -x" (which means it
will include all the WAL required to restore within the backup dump),
and hit Ctrl-C while the WAL is being streamed, you end up with a data
directory that you can start postmaster from, even though the backup was
not complete. So what appears to be a valid backup - it starts up fine -
can actually be corrupt.

I put in a check against that back in March, but it had to be reverted
because it broke crash recovery when the system crashed while a
pg_start_backup() based backup was in progress:

http://archives.postgresql.org/message-id/4DA58686.1050501@enterprisedb.com

Here's a patch to add it back in a more fine-grained fashion. The patch
adds an extra line to backup_label, indicating whether the backup was
taken with pg_start/stop_backup(), or by streaming (= pg_basebackup).
For a backup taken with pg_start_backup(), the behavior is kept the same
as it has been - if the end-of-backup record is not reached during crash
recovery, the database starts up anyway. But for a streamed backup, you
get an error at startup.
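
To illustrate the decision described above, here is a minimal sketch in C. It is not taken from the attached patch: the "BACKUP METHOD: " line format, the enum, and both function names are assumptions made for the example. It assumes a label with no recognizable method line should keep the old, permissive behavior.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical backup methods; the names are illustrative only. */
typedef enum
{
    BACKUP_METHOD_UNKNOWN,
    BACKUP_METHOD_PG_START_BACKUP,
    BACKUP_METHOD_STREAMED
} BackupMethod;

/*
 * Scan the text of a backup_label file, line by line, for a
 * "BACKUP METHOD: ..." line. The line format is an assumption
 * made for this sketch, not something stated in the message.
 */
BackupMethod
parse_backup_method(const char *label)
{
    const char *key = "BACKUP METHOD: ";
    const char *p = label;

    while (p != NULL && *p != '\0')
    {
        /* p always points at the start of a line here */
        if (strncmp(p, key, strlen(key)) == 0)
        {
            const char *val = p + strlen(key);

            if (strncmp(val, "streamed", 8) == 0)
                return BACKUP_METHOD_STREAMED;
            if (strncmp(val, "pg_start_backup", 15) == 0)
                return BACKUP_METHOD_PG_START_BACKUP;
            return BACKUP_METHOD_UNKNOWN;
        }

        /* advance to the character after the next newline, if any */
        p = strchr(p, '\n');
        if (p != NULL)
            p++;
    }
    return BACKUP_METHOD_UNKNOWN;
}

/*
 * Decide whether startup may proceed once recovery has run out of WAL.
 * Only a streamed backup that never reached its end-of-backup record
 * is refused; every other case behaves as before.
 */
bool
startup_allowed(BackupMethod method, bool reached_backup_end)
{
    if (reached_backup_end)
        return true;
    return method != BACKUP_METHOD_STREAMED;
}
```

With these definitions, startup_allowed(BACKUP_METHOD_STREAMED, false) is false, which corresponds to the startup error for an interrupted streamed backup, while startup_allowed(BACKUP_METHOD_PG_START_BACKUP, false) stays true so that crash recovery during a pg_start_backup() based backup keeps working.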

I think this is a nice additional safeguard to have, making streamed
backups more robust. I'd like to add this to 9.1, but it requires an
extra field to be added to the control file, so it would force an
initdb. It's probably not worth that. Or, we could sneak the extra
boolean field into some currently unused pad space in the ControlFile struct.
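
The pad-space idea can be sketched as follows. The two structs are hypothetical stand-ins, not the real ControlFile layout, and the field names are invented for the example. The point is only that a bool placed right after an existing bool, ahead of a more strictly aligned member, can land in bytes the compiler was already padding, so the on-disk size and the offsets of the other fields need not change.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: padding follows existing_flag. */
typedef struct
{
    uint64_t start_point;
    bool     existing_flag;
    /* padding bytes are inserted here so that next_field is aligned */
    uint64_t next_field;
} OldLayout;

/* Hypothetical "after" layout: the new flag occupies one padding byte. */
typedef struct
{
    uint64_t start_point;
    bool     existing_flag;
    bool     backup_end_required;   /* invented name for the new field */
    uint64_t next_field;
} NewLayout;

/*
 * Returns true when adding the flag changed neither the struct size
 * nor the offsets of the pre-existing fields on this platform.
 */
bool
layouts_compatible(void)
{
    return sizeof(OldLayout) == sizeof(NewLayout) &&
           offsetof(OldLayout, existing_flag) == offsetof(NewLayout, existing_flag) &&
           offsetof(OldLayout, next_field) == offsetof(NewLayout, next_field);
}
```

Whether such spare bytes exist in the real struct, and whether old files hold zeros there, would have to be checked against the actual definition.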

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment: require-backup-end-record-1.patch (text/x-diff, 5.6 KB)
