PITR for PostgreSQL 7.5+
I'm posting a few thoughts and plans on how to progress PITR. Initially,
I'd suggest chewing through some ideas to get some direction and then
split out some tasks to pursue individually or in teams.
I'm not claiming leadership, infallibility or anything really - I'm more
than happy to work together on these or other ideas for PITR. Any and
all comments or completely alternate views are welcomed.
It seems very likely that this design/plan comes much later than other
designs/thoughts from members of the PGDG community or elsewhere. All
ideas, corrections or offerings of partial code will be gratefully
accepted. There's no competition here - if you do it first and it works,
great. I'll buy you a beer.
I'll update this as information arrives and the picture changes:
PITR means Point-in-Time-Recovery and is an extension to the current
functionality of PostgreSQL. Achieving PITR will be a mixture of OS
features and integrated dbms features. My suggested first priority is to
sort out the latter and not reinvent-the-wheel too much on OS-provided
OVERALL PITR REQUIREMENTS
- A full backup of the database
- A backup of set of time-sequenceable changes that can be used to
"rollforward" from the point that the full backup was taken to the
desired point in time.
- Best practice, possibly offered by backup management software, to
implement both types of backup using a planned & structured approach
SUGGESTED IMPLEMENTATION GOALS
1. Exactly accurate PITR for databases
2. Robustness of code
3. Close to zero performance impact on normal (backed up) system
4. Good overall performance of total recovery procedures to minimise
5. Other normal PostgreSQL code givens: Berkeley licence, portable ANSI
GENERAL PITR RECOVERY SCENARIO
The general PITR recovery scenario requires:
A - Take a full database backup
B - Take regular log file backups
C - Restore a full backup of a database.
If you cannot do this, hide your face in shame.
If it fails, retry it until it works. If it never does, you're out of
luck. If it succeeds, but portions of the database are corrupt, it may
still be possible to continue if non-critical sections of the dbms are
still intact, such as the system tables.
D - Restore a full set of log records.
If you cannot do this, you will have to accept that you have lost data.
If it fails, retry it until it works. If it never does, at least you
have the full backup point, even if that is out of date (slightly).
E - Locate the point in the set of log files that matches the state at
which the full backup was taken.
If this cannot occur or errors, it may be because you have a set of log
files that don't match your full backup. In that case, you must accept
that the full backup you have is the best you've got.
F - Decide somehow what the point in time is you would like to recover
G - Issue commands to define the Point-in-Time to which you would like
H - Rollforward from the point of the backup to the defined PIT.
If this fails because of corrupted log files, then you may choose to
either: try to carry on restoring past any corruption point, or give up
and be happy with however far you've got.
SUGGESTED FEATURES TO BE IMPLEMENTED
My suggestion is that PostgreSQL PITR team should try to provide
facilities to allow B,E,F,G and H.
Partial facilities already exist to achieve:
A - either using pg_dump or via a full OS file copy on a shutdown
G - rollforwad on WAL logs already possible, but need to implement "stop
at point in time logic".
It shouldn't be our role to implement file handling: backup and restore
at a later time can be done many ways, so let the user pick their
preferred method, according to their preferences, data volume, available
We should assume that the system onto which the restore takes place is
completely different from the system on which the backup was taken. We
may need to alter config files prior to starting the restored database.
Any other pre-requisites?
COMPARISON WITH EXISTING TODO ENTRIES
- Add entries to WAL files so it contains enough information for
recovery after a full backup
This is assumed to be complete. [I'm not sure where the role of pg_clog
is in all of this, so this may be a broken assumption.] Comments?
- Write application to archive WAL files to tape, disk, or network
Probably need to do first part, but I'm arguing not to do the copy to
- Determine how the archive writer will interact with the checkpoint
process and WAL file recycling
Required for above
- Write an application that will recover from a pg_dump or tar backup
and then recover WAL files to a specific time
Seems possible, but is possible without this initially, so do other
SUGGESTED PHASING OF IMPLEMENTATION
Here's a suggested phasing of work, to move towards full PITR. Clearly
other ways are also possible...
Phase 0: Planning & Discussion, including writing detailed test plan
Phase 1.1: Implement backup & full restore from OS image (B, E)
Phase 1.2: Implement PITR (add G and modify for H)
Phase 1.3: Implement WAL log inspection facilities (to improve step F)
Phase 2.1: Implement step A/C using pg_dump
Overall, doing this is going to require lots of investigation and
discussion. Most importantly, its going to involve a fair amount of
detailed and fairly intrusive testing. i.e. complete database wipe and
Please don't anyone reading this think I'm going to cut acres of
code...we'll need lots of help all sorts of people, especially from
veteran PostgreSQL developers.
No time plans. We have time to get it right.
FEATURE OVERVIEWS & INVESTIGATIONS/IMPLEMENTATIONS REQUIRED
B - Backing up WAL log files
-Ordinarily, when old log segment files are no longer needed, they are
recycled (renamed to become the next segments in the numbered sequence).
This means that the data within them must be copied from there to
AFTER postgres has closed that file
BEFORE it is renamed and recycled
Spotting that window of opportunity fairly accurately is important. We
will need some mechanism to recognise availability of file for copying,
then lock the file and copy it away.
-what will happen if postgres tries to reuse file while we are still
copying. Does postgres hang, waiting for copy to complete (which may be
a long time if the copy is hanging because of a disk full condition).
This may already be catered for in the code since recycle logic can
handle not-ready-to-recycle conditions (not sure).
-Is it possible to allow read-only queries to continue in this case?
-what will happen if copy fails?
-who will do the copying? Should we use another sub-process of
postmaster to do the copying, or should we try to use a separate program
-what will happen if that process slows down? What will we do if it
-how will we notice that a WAL log is now ready for copying? How will we
communicate that to the archiver process?
-Manual indicates that current WAL format is bulky and would require
some compressed format to be implemented. Initially, I suggest ignoring
this and simply relying of OS or hardware/tape compression methods.
E - Tee up log files to backup state
Should be fairly straightforward when using a full OS file backup of a
correctly shutdown database.
-Investigate use of WAL file names, to see if any problems exist there.
G - Issue command for Recovery point in time
With full OS file backup, if the database is shutdown correctly, then we
will need a way to tell the database "you think you're up to date, but
you're not - I've added some more WAL files into the directories, so
roll forward on those now please".
-Decide when and how to issue command.
-Decide syntax for command?
RECOVER DATABASE TO TIME <TIME> if an SQL statement
Could also implement a new switch on postmaster, to tell it to come up
in recovery mode even though it thinks it doesn't need to?
H - Modify WAL rollfoward
Rollforward until point-in-time should be fairly straightforward.
Current implementation is once-started it will run until it runs out of
files to apply, so simply change the termination test. Where to store
the point-in-time variable? Global? Pass as parameter when WAL
rollforward logic invoked as a command?
A/C - using pg_dump
Following restore from pg_dump, pg_control will not be set at the same
point it was at when the backup was taken. As a result, step E would
incorrectly identify the starting position for the rollforward, which
would either result in a "cannot find correct log" message, or
attempting to rollforward with completely inappropriate WAL data. Some
means of getting round this issue needs to be investigated. Idea: modify
pg_dump to record the transactionid at the start of the pg_dump and have
it generate as the LAST SQL statement in the dump a statement to modify
the system tables (gasp!) to reflect the equivalent state at backup.
Maybe; investigation required. This route may well be complicated by
National Language issues etc, whereas the full backup method will at
least ignore all of that.
What else do we need to do?
What complications are there?
= = = = = = =
In response to
pgsql-hackers by date
|Next:||From: Tom Lane||Date: 2004-02-13 00:00:25|
|Subject: Re: 7.4 - FK constraint performance |
|Previous:||From: scott.marlowe||Date: 2004-02-12 23:59:01|
|Subject: Re: RFC: Query Planner making a distinction between Cross|
pgsql-hackers-pitr by date
|Next:||From: Cott Lang||Date: 2004-02-13 00:48:20|
|Subject: Re: Review of last summer's PITR patch|
|Previous:||From: Mark Kirkwood||Date: 2004-02-12 23:16:41|
|Subject: Re: [HACKERS] PITR Dead horse?|