Re: point in time recovery and moving datafiles online

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marc Munro <marc(at)bloodnok(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: point in time recovery and moving datafiles online
Date: 2002-02-22 05:27:56
Message-ID: 14433.1014355676@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers

Marc Munro <marc(at)bloodnok(dot)com> writes:
> On Thu, 2002-02-21 at 19:52, Tom Lane wrote:
>> It seems to me that you can get the desired results without any
>> locking. Assume that you start archiving the WAL just after a
>> checkpoint record. Also, start copying data files to your backup
>> medium. Some not inconsiderable time later, you are done copying
>> data files. You continue copying off and archiving WAL entries.
>> You cannot say that the copied data files correspond to any particular
>> point in the WAL, or that they form a consistent set of data at all
>> --- but if you were to reload them and replay the WAL into them
>> starting from the checkpoint, then you *would* have a consistent set
>> of files once you reached the point in the WAL corresponding to the
>> end-time of the data file backup. You could stop there, or continue
>> WAL replay to any later point in time.

> If I understand you correctly this is exactly what I was thinking, based
> on Oracle recovery. But we must still prevent writes to each data file
> as we back it up, so that it remains internally consistent.

No, you're missing my point. You don't need intra-file consistency any
more than you need cross-file consistency. You merely need to be sure
that you have captured all the state of pages that are not updated
anywhere in the series of WAL entries that you have.
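To make that concrete, here is a toy simulation of the idea (the page store,
WAL format, and all names here are invented for illustration, not PostgreSQL
internals): a "fuzzy" backup is taken page by page while writes continue, and
replaying the whole archived WAL onto the restored backup reproduces the live
state exactly, even though the backup itself was never consistent.

```python
# Toy model of fuzzy backup + WAL replay. Illustrative only; not PG internals.
import random

def simulate():
    live = {p: f"v0-{p}" for p in range(8)}   # data pages as of the checkpoint
    wal = []                                   # WAL archived from the checkpoint on

    def write(page, val):
        wal.append((page, val))                # WAL record goes out first
        live[page] = val                       # then the data page is updated

    backup = {}
    for p in sorted(live):
        # Writes keep happening while the backup copies pages one at a time,
        # so the backup is NOT a consistent snapshot of any instant.
        write(random.randrange(8), f"v{len(wal)}")
        backup[p] = live[p]

    # Recovery: restore the fuzzy backup, then replay the entire WAL.
    restored = dict(backup)
    for page, val in wal:
        restored[page] = val
    return restored == live

assert simulate()
```

Pages never touched by the WAL were captured at their checkpoint-time values;
pages that were touched get overwritten by replay regardless of what the
backup happened to catch, so the end state matches the live files.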

I had originally started to compose an email suggesting that locking on a
per-disk-page basis (not a per-file basis) would be better, but I do not
believe you need even that, for two reasons:

1. PG will always write changes to data files in page-size write
operations. The Unix kernel guarantees that these writes appear atomic
from the point of view of other processes. So the data-file-backup
process will see pagewise consistent data in any case.

2. Even if the backup process managed to acquire an inconsistent
(partially updated) copy of a page due to a concurrent write by a
Postgres backend, we do not care. The WAL activity is designed to
ensure recovery from partial-page disk writes, and backing up such an
inconsistent page copy would be isomorphic to a system failure after a
partial page write. Replay of the WAL will ensure that the page will be
fully written from the WAL data.
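Point 2 can be sketched the same way (again a toy, with an invented record
format): if the WAL record carries a full image of the page, replay simply
overwrites whatever torn copy the backup picked up.

```python
# Toy illustration of point 2: a half-copied ("torn") page in the backup
# does not matter, because replay rewrites the page in full from the WAL.
# The 4-byte "page" and record format are invented for illustration.

def replay_full_page(backup_page: bytes, wal_full_page_image: bytes) -> bytes:
    # Replay ignores the damaged contents and installs the logged image.
    return wal_full_page_image

old = b"AAAA"
new = b"BBBB"
torn = new[:2] + old[2:]          # backup caught the page mid-write: b"BBAA"

recovered = replay_full_page(torn, new)
assert recovered == new           # the inconsistent copy is repaired by replay
```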

In short, all you need is a mechanism for archiving off the WAL data and
locating a checkpoint record in the WAL as a starting point for replay.
Your data-file backup mechanism can be plain ol' tar or cp -r. No
interlocks needed or wanted.
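The one mechanism that paragraph calls for -- locating a checkpoint record to
use as the replay starting point -- might look like this sketch (WAL record
layout invented for illustration): scan backwards from the position where the
data-file copy began to the nearest checkpoint, and replay from there.

```python
# Hypothetical sketch: find the checkpoint record at or before the point in
# the archived WAL where the data-file backup started. Record format invented.

wal = [
    ("update", 1), ("checkpoint", None), ("update", 3),
    ("update", 5), ("checkpoint", None), ("update", 2),
]
backup_start = 3   # WAL position when the data-file copy began

def checkpoint_at_or_before(records, pos):
    """Scan backwards from pos for the nearest checkpoint record."""
    for i in range(pos, -1, -1):
        if records[i][0] == "checkpoint":
            return i
    raise ValueError("no checkpoint before backup start; cannot replay")

start = checkpoint_at_or_before(wal, backup_start)
assert wal[start][0] == "checkpoint"   # replay covers wal[start:]
```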

regards, tom lane
