PostgreSQL Database Media Recovery Summary This document describes a set of options and commands that together provide for database recovery in the event of loss or corruption of one or more PostgreSQL database files. This includes an on line backup facility pg_copy, and a set of configuration options that tell the PostgreSQL DBMS not to re-use recovery logs. Lost or corrupt data is recovered by restoring a backup of the database and then rolling forward all database changes recorded in the recovery logs. Hereafter PostgreSQL's recovery logs are referred to as xlogs as they reside in the pg_xlog directory. Normally the PostgreSQL database manager uses a set of 16 mega-byte xlogs in a round robin fashion to record database change records. After database checkpoints, log records for database changes may overwrite xlogs used earlier. The various roll forward recovery options proposed hereafter, eliminate the overwriting of earlier log data. To perform roll forward recovery means you restore a backup of the database and you copy all archived xlogs taken since that backup into the pg_xlog directory. Then you start the postmaster with the roll forward option and roll forward recovery occurs. Note that in this document database is used synonymously with database cluster. Where a database cluster is the set of databases served by a particular instance of the Postmaster. PG_COPY The copy utility is primarily intended to work with the PostgreSQL roll forward recovery mechanism. Otherwise, pg_dump gives a perfectly good on-line backup of a PostgreSQL database. PG_COPY is described further in pgCopy.doc. Roll Forward Recovery Roll forward recovery replays the changes made to a database over time as recorded by the records in the recovery log (xlog). In the event of a recovery from data loss or corruption all logs are rolled forward from first to last. The roll forward recovery mechanism can also be used to recover a database from the effects of a committed but errant transaction. For example someone inadvertently deletes all the order rows instead of just the orders for customer 100. If you know the time that the errant transaction occurred then you can specify to roll forward to a point in time just before the errant transaction occurred. Enabling Database Roll Forward Recovery You can enable roll forward recovery so that the xlogs are not re-used or so that the xlogs are duplicated. Wal_file_reuse = [ true | false ] The default is false. Use this Postmaster configuration parameter to instruct the database manager to not re- use xlog files. Prior to enabling this parameter you should have made a copy of the database cluster either with the pg_copy utility or copy all the files of the database cluster with an OS utility. Note that a pg_dump file is not a sufficient backup for roll forward recovery. The starting point for a roll forward recovery session must be the restoration of a physical copy of the database. Once the database has been restored then you copy all xlogs filled since that backup into the pg_xlog directory. Roll forward recovery then occurs by starting a postmaster pointed at the root of the restored database cluster. The recovery process redoes all changes in the xlogs starting with the xlog specified in the checkpoint record stored in the pg_control file. This parameter is intended as an option where the pg_xlog directory is mirrored by the system but the database is not. The mirroring protects the xlogs and the xlogs in conjunction with a backup can be used to recover from database loss or corruption. The other situation in which this option might be used is if xlogs are copied to another system as they are filled where they are roll forward into a "hot standby" database cluster. Note that if you don't use pg_copy to backup the database then the postmaster for the database cluster must be stopped before backing up the cluster. Wal_file_duplicate = Use this Postmaster configuration parameter to instruct the database manager to duplex the recovery log. The specification of this option implies that wal_file_reuse is true. Prior to enabling this parameter you should have made a copy of the database cluster either with the pg_copy utility or copy all the files of the database cluster with an OS utility. Note that a pg_dump file is not a sufficient backup for roll forward recovery. The starting point for a roll forward recovery session must be the restoration of a physical copy of the database and then a restore of all xlog files accumulated since the backup into the pg_xlog directory. Roll forward recovery then occurs by starting a postmaster pointed at the root of the restored database cluster. The xlog files in the pg_xlog directory are overwritten as usual but now a duplicate xlog is produced in the duplex_xlog_directory which is never over written. Deleting xlogs Since xlogs are not automatically re-used either in the duplex directory or the pg_xlog directory depending upon the configuration option chosen, when can old logs be deleted? The following system function also can be used to determine which xlog files may be deleted. Functions Wal_file( ) Request_type := [ 'current' | 'last' | 'checkpoint'] Wal_file (current) returns the name of the log file currently being written to. Wal_file(last) returns the name of the last log file filled. Wal_file(checkpoint) returns the name of the file containing the current redo position. The current redo position is the position in the recovery log where crash recovery would start if the system were to crash now. All logs prior to this one will not be needed to recover the database cluster and could be safely removed. Note that this new function will require a catalog version change. Given a database where wal_file_reuse is set to false then the database super-user can use the following command to remove all xlog files prior to the specified file name. ALTER SYSTEM PURGE WAL_FILES The database manager will remove all WAL_FILES less than or equal to log_name unless any of those logs are part of the current checkpoint interval. Note that this command acts on both the primary pg_xlog directory and on the wal_file_duplicate directory if enabled. Rolling Forward To roll forward recover a database, first restore a backup of the database. Then restore into the pg_xlog directory all of the archived xlogs taken since this backup. Then start the postmaster using the –D parameter to point the postmaster at the root directory of the restored database and specify roll_forward = yes. To cause the postmaster to do a roll forward use the configuration parameter: roll_forward = yes This parameter is necessary so that when rolling forward from a restore of a database cluster that was properly closed the startup process knows to execute log redo even though the database cluster is not in a crashed state. Once roll forward is completed the postmaster exits. You can't use the roll_forward parameter to start up a postmaster to allow users to connect to the database. roll_forward_until =