PITR: Proposed modifications to JR's design

From: Patrick Macdonald <patrickm(at)redhat(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: PITR: Proposed modifications to JR's design
Date: 2003-04-30 17:34:01
Message-ID: 3EB00909.5050102@redhat.com
Lists: pgsql-hackers

I've been working with JR's PITR patch (certainly not as much as I would
have liked, sorry) and would like to discuss the following proposed
modifications with the group.

1. Syntax

JR's patch did not include the notion of forward recovery to a specific
point in time (just to the end of the logs). I intend to alter the
proposed syntax to accept a user-defined date/time combination. JR did
mention this as an add-on, but I believe it's important enough to include
from the start.

Also, the date/time of the last encountered commit log record will be
returned to the user at the end of any forward recovery iteration.
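
As a rough illustration of the stopping rule described in point 1 (this is
not code from JR's patch; `LogRecord`, `replay_until`, and the field names
are hypothetical), forward recovery to a user-supplied time could apply
records until it meets a commit record stamped later than the target, then
report the time of the last commit it actually applied:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional, Tuple


@dataclass
class LogRecord:
    """Hypothetical, heavily simplified log record."""
    lsn: int
    is_commit: bool
    commit_time: Optional[datetime] = None  # set only on commit records


def replay_until(
    records: Iterable[LogRecord], target: Optional[datetime]
) -> Tuple[int, Optional[datetime]]:
    """Replay records in order, stopping before the first commit later than target.

    target=None means "recover to the end of the logs".
    Returns (records applied, time of the last commit applied).
    """
    applied = 0
    last_commit = None
    for rec in records:
        if rec.is_commit and target is not None and rec.commit_time > target:
            break  # this commit lies past the requested point in time
        # A real implementation would redo the record's change here.
        applied += 1
        if rec.is_commit:
            last_commit = rec.commit_time
    return applied, last_commit
```

With `target=None` the loop runs to the end of the logs, which matches the
behaviour JR's patch already has; the returned commit time is what would be
shown to the user after each forward recovery iteration.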

2. PITR End Log Extent Handling

Currently, when the user has decided to end the forward recovery phase,
the log extent that is currently being read is truncated. I talked
with Tom Lane and we thought it might be a good idea not to truncate
the file but to leave the extent alone and remember the location/lsn
where we left off. This "remembering mechanism" would be a proposed
extent header which would contain chaining information and is
discussed in Point 3: On-disk Modifications. This would leave the
last log extent untouched allowing reuse in future forward recovery
scenarios. With the proposed change, the next log extent created/used
would be named off-sequence by some factor (even 1 if so inclined).

Let's look at an example. Say I'm using log extents 1,2,3,4,5 and
want to recover to a point in extent 3.

1 -> 2 -> 3 -> 4 -> 5
          \-----> 13 -> 14

When we have reached the identified point in extent 3, we create a
new log extent (13) and record our current extent 3 location/lsn
in extent 13's header and back-chain the other header information to
extent 3. Log extent 3 remains unchanged (which is especially
important if this is the only copy of extent 3). If the user wishes
to restore the old backup and forward recover to a point in extent 4,
the logs are in a state to allow this functionality. There are now
two series/chains of log extents possible to forward recover through
without the destruction of log content. A downside to this solution
is that the header of the next extent must be read in to determine
where to stop reading log records in the current log extent.
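
The example can be sketched in a few lines, assuming each extent header
carries only a back-pointer and an optional stop location (the
`ExtentHeader` fields and `recovery_chain` are hypothetical names, and the
LSN value is made up). The sketch walks the back-chain from the newest
extent of the chosen series and returns the extents in replay order, each
paired with the LSN at which reading should stop:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ExtentHeader:
    """Hypothetical chaining fields of a log extent header."""
    extent_id: int
    prior_extent: Optional[int] = None    # back-chain; None for the first extent
    prior_stop_lsn: Optional[int] = None  # where reading stopped in the prior
                                          # extent; None = it was read to the end


def recovery_chain(
    headers: Dict[int, ExtentHeader], last_extent: int
) -> List[Tuple[int, Optional[int]]]:
    """Return [(extent_id, stop_lsn)] in replay order, ending at last_extent."""
    chain = []
    stop_lsn = None  # the newest extent has no successor limiting it
    current: Optional[int] = last_extent
    while current is not None:
        hdr = headers[current]
        chain.append((current, stop_lsn))
        # The stop point for the *prior* extent lives in *this* header.
        stop_lsn = hdr.prior_stop_lsn
        current = hdr.prior_extent
    chain.reverse()
    return chain


# The two series from the example: 1..5, and 13 -> 14 branching off extent 3.
headers = {
    1: ExtentHeader(1),
    2: ExtentHeader(2, prior_extent=1),
    3: ExtentHeader(3, prior_extent=2),
    4: ExtentHeader(4, prior_extent=3),
    5: ExtentHeader(5, prior_extent=4),
    13: ExtentHeader(13, prior_extent=3, prior_stop_lsn=0x3000A0),
    14: ExtentHeader(14, prior_extent=13),
}
```

Here `recovery_chain(headers, 14)` yields extents 1, 2, 3, 13, 14 with a
stop LSN attached only to extent 3, while `recovery_chain(headers, 5)`
yields the untouched original series. It also makes the downside visible:
the stop point for extent 3 is known only after extent 13's header has
been read.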

3. On-disk Modifications

Instead of issuing a new type of log record when we move to a new
extent, we use the first block/page of an extent as an extent header.
This header would contain cluster and file identification as well as
chaining information. The chaining information would include the
unique identifier of the prior log extent, the location of the last
read log record (if the last extent contained an ending point of a
relevant forward recovery situation) and some housekeeping
(timestamps, etc).

The pg_control file additions would include JR's unique cluster
identifier, a flag identifying the type of logging within the cluster
(circular or archival), and a small forward recovery section to include
backup information (log extent number/lsns/timestamps). This backup
information can then be used during the forward recovery phase to
indicate the starting point (log extent/lsn) and the ending point
(timestamp) required to bring the cluster into a consistent state.
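
A minimal sketch of how the forward recovery section might be consulted,
under stated assumptions: all class, field, and function names below are
hypothetical, and the rule that forward recovery is refused under circular
logging is my assumption rather than something stated above. The check
returns the starting extent/lsn and rejects a target time earlier than the
ending point needed for consistency.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional, Tuple


class LoggingMode(Enum):
    CIRCULAR = "circular"
    ARCHIVAL = "archival"


@dataclass
class BackupInfo:
    start_extent: int   # log extent where replay must begin
    start_lsn: int      # location within that extent
    end_time: datetime  # replay must reach at least this time


@dataclass
class ControlAdditions:
    cluster_id: int
    logging_mode: LoggingMode
    backup: Optional[BackupInfo] = None


def recovery_start(
    ctl: ControlAdditions, target: Optional[datetime]
) -> Tuple[int, int]:
    """Return (start_extent, start_lsn), or raise if the request cannot
    leave the cluster consistent. target=None means end of logs."""
    if ctl.logging_mode is not LoggingMode.ARCHIVAL:
        # Assumption: circular logging recycles extents, so no replay.
        raise ValueError("forward recovery requires archival logging")
    if ctl.backup is None:
        raise ValueError("no backup information recorded")
    if target is not None and target < ctl.backup.end_time:
        raise ValueError("target is earlier than the backup's ending point")
    return ctl.backup.start_extent, ctl.backup.start_lsn
```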

Comments/concerns/gotchas appreciated.

Cheers,
Patrick
