Re: point in time recovery and moving datafiles online

From: Marc Munro <marc(at)bloodnok(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: point in time recovery and moving datafiles online
Date: 2002-02-22 17:03:57
Message-ID: 1014397437.15016.3.camel@bloodnok.com
Lists: pgsql-general pgsql-hackers

Tom,
Again, many thanks for the reply.

On Thu, 2002-02-21 at 21:27, Tom Lane wrote:
>
> No, you're missing my point. You don't need intra-file consistency any
> more than you need cross-file consistency. You merely need to be sure
> that you have captured all the state of pages that are not updated
> anywhere in the series of WAL entries that you have.
>
> I had originally started to compose email suggesting that locking on a
> per-disk-page basis (not a per-file basis) would be better, but I do
> not believe you need even that, for two reasons:
>
> 1. PG will always write changes to data files in page-size write
> operations. The Unix kernel guarantees that these writes appear atomic
> from the point of view of other processes. So the data-file-backup
> process will see pagewise consistent data in any case.
>
> 2. Even if the backup process managed to acquire an inconsistent
> (partially updated) copy of a page due to a concurrent write by a
> Postgres backend, we do not care. The WAL activity is designed to
> ensure recovery from partial-page disk writes, and backing up such an
> inconsistent page copy would be isomorphic to a system failure after a
> partial page write. Replay of the WAL will ensure that the page will be
> fully written from the WAL data.

Wow. I hadn't appreciated that. This is way cooler than I realised.
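
A toy sketch of point 2, not PostgreSQL code. It assumes the WAL holds a full image of every page changed since the checkpoint, which is a simplification of the real record format; the names replay, PAGE_SIZE and the page contents are made up for illustration. The backup captures a half-written page, and replay overwrites it with the image from the log:

```python
PAGE_SIZE = 8192  # assumed page size, for illustration only


def replay(backup_pages, wal_records):
    """Apply WAL records in order; each record is (page_no, full_page_image)."""
    pages = dict(backup_pages)
    for page_no, image in wal_records:
        # A full page image replaces whatever the backup captured,
        # including a partially written page.
        pages[page_no] = image
    return pages


old = b"A" * PAGE_SIZE
new = b"B" * PAGE_SIZE

# The backup read the page while a backend was halfway through writing it.
torn = new[: PAGE_SIZE // 2] + old[PAGE_SIZE // 2:]

backup = {0: torn}
wal = [(0, new)]  # logged after the checkpoint used as the replay start

restored = replay(backup, wal)
```

After replay, restored[0] is identical to new, exactly as it would be after a crash in the middle of that write.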

> In short, all you need is a mechanism for archiving off the WAL data and
> locating a checkpoint record in the WAL as a starting point for replay.
> Your data-file backup mechanism can be plain ol' tar or cp -r. No
> interlocks needed or wanted.

So, the whole job is much easier than I thought. This is a good thing.
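
As a rough sketch of how small the mechanism could be, assuming a plain recursive copy for the data files and a separate step that copies WAL files into an archive directory. The function names and directory layout here are hypothetical, and a real archiver would have to copy only segments the server has finished writing, which this sketch does not check:

```python
import shutil
import tempfile
from pathlib import Path


def copy_data_files(data_dir, backup_dir):
    """Plain recursive copy of the data directory, like `cp -r`; no locking."""
    shutil.copytree(data_dir, backup_dir)


def archive_wal(wal_dir, archive_dir):
    """Copy WAL files not yet in the archive; return the names copied."""
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    copied = []
    for segment in sorted(Path(wal_dir).iterdir()):
        target = archive / segment.name
        if segment.is_file() and not target.exists():
            shutil.copy2(segment, target)
            copied.append(segment.name)
    return copied


# Small demonstration on throwaway directories (all names are made up).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "data" / "base").mkdir(parents=True)
    (root / "data" / "base" / "1234").write_bytes(b"page data")
    (root / "wal").mkdir()
    (root / "wal" / "0000000000000001").write_bytes(b"wal records")

    copy_data_files(root / "data", root / "backup")
    first_run = archive_wal(root / "wal", root / "archive")
    second_run = archive_wal(root / "wal", root / "archive")
    backup_ok = (root / "backup" / "base" / "1234").read_bytes() == b"page data"
```

Running the archive step twice copies each file only once, so it could be called repeatedly while the backup is in progress.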

I will rethink my strategy. It looks like the tasks now are to manage
the archival of WAL files, provide an interface to manage the recovery
process, produce some guidelines/scripts for managing hot backups, and
write the documentation.

--
Marc marc(at)bloodnok(dot)com
