Index: doc/src/sgml/ref/pg_resetxlog.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/pg_resetxlog.sgml,v
retrieving revision 1.13
diff -c -c -r1.13 pg_resetxlog.sgml
*** doc/src/sgml/ref/pg_resetxlog.sgml 25 Apr 2006 21:02:33 -0000 1.13
--- doc/src/sgml/ref/pg_resetxlog.sgml 26 Apr 2006 02:14:43 -0000
***************
*** 20,25 ****
--- 20,26 ----
pg_resetxlog
-f
-n
+ -r
-ooid
-x xid
-m mxid
***************
*** 57,78 ****
If pg_resetxlog complains that it cannot determine
! valid data for pg_control>, you can force it to proceed anyway
! by specifying the -f> (force) switch. In this case plausible
! values will be substituted for the missing data. Most of the fields can be
! expected to match, but manual assistance may be needed for the next OID,
! next transaction ID, next multitransaction ID and offset,
! WAL starting address, and database locale fields.
! The first five of these can be set using the switches discussed below.
! pg_resetxlog's own environment is the source for its
! guess at the locale fields; take care that LANG> and so forth
! match the environment that initdb> was run in.
! If you are not able to determine correct values for all these fields,
! -f> can still be used, but
the recovered database must be treated with even more suspicion than
! usual: an immediate dump and reload is imperative. Do not>
! execute any data-modifying operations in the database before you dump;
! as any such action is likely to make the corruption worse.
--- 58,79 ----
If pg_resetxlog complains that it cannot determine
! valid data for pg_control>, you can force it to proceed
! anyway by specifying the -f> (force) switch. In this case
! plausible values will be substituted for the missing data.
! pg_resetxlog's own environment is the source for
! its guess at the locale fields; take care that LANG> and so
! forth match the environment that initdb> was run in.
! /xlog> files are used to determine other parameters, like
! next OID, next transaction ID, next multi-transaction ID and offset,
! WAL starting address, and database locale fields. Because the values
! determined this way might be wrong, the first five of these can be set using
! the switches discussed below. If you are not able to determine correct
! values for all these fields, -f> can still be used, but
the recovered database must be treated with even more suspicion than
! usual: an immediate dump and reload is imperative. Do
! not> execute any data-modifying operations in the database before
! you dump, as any such action is likely to make the corruption worse.
***************
*** 150,155 ****
--- 151,161 ----
+ The -r> (restore) switch restores the pg_control> counters listed
+ above without resetting the write-ahead log.
+
+
+
The -n> (no operation) switch instructs
pg_resetxlog to print the values reconstructed from
pg_control> and then exit without modifying anything.
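Note on the C changes that follow: they identify WAL segment files by their 24-hex-digit names and keep them ordered by timeline, log id, and segment number. The block below is a minimal standalone sketch of that parsing and ordering, not the patch's own code. The names SegmentId, parse_segment_name, and compare_segments are hypothetical and exist only for illustration; the name format is taken from the patch's ValidXLogFileName().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the tli/logid/seg fields of XLogFileName. */
typedef struct SegmentId
{
    unsigned int tli;   /* timeline ID */
    unsigned int logid; /* log file ID */
    unsigned int seg;   /* segment number within the log file */
} SegmentId;

/*
 * Parse a segment file name of exactly 24 uppercase hex digits.
 * Returns false, leaving *out untouched, if the name is malformed.
 */
static bool
parse_segment_name(const char *name, SegmentId *out)
{
    assert(name != NULL && out != NULL);
    if (strlen(name) != 24 || strspn(name, "0123456789ABCDEF") != 24)
        return false;
    return sscanf(name, "%8x%8x%8x", &out->tli, &out->logid, &out->seg) == 3;
}

/*
 * Lexicographic comparison on (tli, logid, seg).
 * Returns <0, 0, or >0 in the manner of strcmp().
 */
static int
compare_segments(const SegmentId *a, const SegmentId *b)
{
    if (a->tli != b->tli)
        return a->tli < b->tli ? -1 : 1;
    if (a->logid != b->logid)
        return a->logid < b->logid ? -1 : 1;
    if (a->seg != b->seg)
        return a->seg < b->seg ? -1 : 1;
    return 0;
}
```

Comparing the three fields in strict priority order means a later log id always sorts after an earlier one on the same timeline, whatever the segment numbers are.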
Index: src/bin/pg_resetxlog/pg_resetxlog.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_resetxlog/pg_resetxlog.c,v
retrieving revision 1.43
diff -c -c -r1.43 pg_resetxlog.c
*** src/bin/pg_resetxlog/pg_resetxlog.c 5 Apr 2006 03:34:05 -0000 1.43
--- src/bin/pg_resetxlog/pg_resetxlog.c 26 Apr 2006 02:14:48 -0000
***************
*** 4,29 ****
* A utility to "zero out" the xlog when it's corrupt beyond recovery.
* Can also rebuild pg_control if needed.
*
! * The theory of operation is fairly simple:
* 1. Read the existing pg_control (which will include the last
* checkpoint record). If it is an old format then update to
* current format.
! * 2. If pg_control is corrupt, attempt to intuit reasonable values,
! * by scanning the old xlog if necessary.
* 3. Modify pg_control to reflect a "shutdown" state with a checkpoint
* record at the start of xlog.
* 4. Flush the existing xlog files and write a new segment with
* just a checkpoint record in it. The new segment is positioned
* just past the end of the old xlog, so that existing LSNs in
* data pages will appear to be "in the past".
- * This is all pretty straightforward except for the intuition part of
- * step 2 ...
*
! *
! * Portions Copyright (c) 1996-2006, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
- * $PostgreSQL: pgsql/src/bin/pg_resetxlog/pg_resetxlog.c,v 1.43 2006/04/05 03:34:05 tgl Exp $
*
*-------------------------------------------------------------------------
*/
--- 4,32 ----
* A utility to "zero out" the xlog when it's corrupt beyond recovery.
* Can also rebuild pg_control if needed.
*
! * The theory of reset operation is fairly simple:
* 1. Read the existing pg_control (which will include the last
* checkpoint record). If it is an old format then update to
* current format.
! * 2. If pg_control is corrupt, attempt to rebuild the values
! * by scanning the old xlog; if that fails, try to guess them.
* 3. Modify pg_control to reflect a "shutdown" state with a checkpoint
* record at the start of xlog.
* 4. Flush the existing xlog files and write a new segment with
* just a checkpoint record in it. The new segment is positioned
* just past the end of the old xlog, so that existing LSNs in
* data pages will appear to be "in the past".
*
! * The algorithm for restoring the pg_control values from old xlog files:
! * 1. Collect all of the active xlog files from the xlog directory into a list
! * in increasing order of timeline, log id, and segment id.
! * 2. Search the list to find the oldest xlog file of the latest timeline.
! * 3. Scan the records from the oldest xlog file of the latest timeline
! * to the latest xlog file of the latest timeline; whenever a checkpoint
! * record is found, update the latest and previous checkpoints.
! * Portions Copyright (c) 1996-2006, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
*
*-------------------------------------------------------------------------
*/
***************
*** 46,51 ****
--- 49,57 ----
#include "catalog/catversion.h"
#include "catalog/pg_control.h"
+ #define GUESS 0
+ #define WAL 1
+
extern int optind;
extern char *optarg;
***************
*** 53,75 ****
static ControlFileData ControlFile; /* pg_control values */
static uint32 newXlogId,
newXlogSeg; /* ID/Segment of new XLOG segment */
- static bool guessed = false; /* T if we had to guess at any values */
static const char *progname;
static bool ReadControlFile(void);
! static void GuessControlValues(void);
! static void PrintControlValues(bool guessed);
static void RewriteControlFile(void);
static void KillExistingXLOG(void);
static void WriteEmptyXLOG(void);
static void usage(void);
int
main(int argc, char *argv[])
{
int c;
bool force = false;
bool noupdate = false;
TransactionId set_xid = 0;
Oid set_oid = 0;
--- 59,133 ----
static ControlFileData ControlFile; /* pg_control values */
static uint32 newXlogId,
newXlogSeg; /* ID/Segment of new XLOG segment */
static const char *progname;
+ static uint64 sysidentifier=-1;
+
+ /*
+ * We use a list to store the active xlog files found in the xlog
+ * directory, in increasing order of timeline, log id, and
+ * segment id.
+ *
+ */
+ typedef struct XLogFileName {
+ TimeLineID tli;
+ uint32 logid;
+ uint32 seg;
+ char fname[256];
+ struct XLogFileName *next;
+ } XLogFileName;
+
+ /* The list head */
+ static XLogFileName *xlogfilelist=NULL;
+
+ /* LastXLogFile is the latest file in the latest timeline,
+ CurXLogFile is the oldest file in the latest timeline
+ */
+ static XLogFileName *CurXLogFile, *LastXLogFile;
+
+ /* The last checkpoint found in xlog file.*/
+ static CheckPoint lastcheckpoint;
+
+ /* The last and previous checkpoint pointers found in xlog file.*/
+ static XLogRecPtr prevchkp, lastchkp;
+
+ /* the database state.*/
+ static DBState state=DB_SHUTDOWNED;
+
+ /* the total number of checkpoints found in the xlog files.*/
+ static int found_checkpoint=0;
+
static bool ReadControlFile(void);
! static bool RestoreControlValues(int mode);
! static void PrintControlValues(void);
! static void UpdateCtlFile4Reset(void);
static void RewriteControlFile(void);
static void KillExistingXLOG(void);
static void WriteEmptyXLOG(void);
static void usage(void);
+ static void GetXLogFiles(void);
+ static bool ValidXLogFileName(char * fname);
+ static bool ValidXLogFileHeader(XLogFileName *segfile);
+ static bool ValidXLOGPageHeader(XLogPageHeader hdr, uint tli, uint id, uint seg);
+ static bool CmpXLogFileOT(XLogFileName * f1, XLogFileName *f2);
+ static bool IsNextSeg(XLogFileName *prev, XLogFileName *cur);
+ static void InsertXLogFile( char * fname );
+ static bool ReadXLogPage(void);
+ static bool RecordIsValid(XLogRecord *record, XLogRecPtr recptr);
+ static bool FetchRecord(void);
+ static void UpdateCheckPoint(XLogRecord *record);
+ static void SelectStartXLog(void);
+ static int SearchLastCheckpoint(void);
+ static int OpenXLogFile(XLogFileName *sf);
+ static void CleanUpList(XLogFileName *list);
int
main(int argc, char *argv[])
{
int c;
bool force = false;
+ bool restore = false;
bool noupdate = false;
TransactionId set_xid = 0;
Oid set_oid = 0;
***************
*** 84,90 ****
char *DataDir;
int fd;
char path[MAXPGPATH];
!
set_pglocale_pgservice(argv[0], "pg_resetxlog");
progname = get_progname(argv[0]);
--- 142,150 ----
char *DataDir;
int fd;
char path[MAXPGPATH];
! bool ctlcorrupted = false;
! bool PidLocked = false;
!
set_pglocale_pgservice(argv[0], "pg_resetxlog");
progname = get_progname(argv[0]);
***************
*** 104,117 ****
}
! while ((c = getopt(argc, argv, "fl:m:no:O:x:")) != -1)
{
switch (c)
{
case 'f':
force = true;
break;
!
case 'n':
noupdate = true;
break;
--- 164,181 ----
}
! while ((c = getopt(argc, argv, "fl:m:no:O:x:r")) != -1)
{
switch (c)
{
case 'f':
force = true;
break;
!
! case 'r':
! restore = true;
! break;
!
case 'n':
noupdate = true;
break;
***************
*** 255,271 ****
}
else
{
! fprintf(stderr, _("%s: lock file \"%s\" exists\n"
! "Is a server running? If not, delete the lock file and try again.\n"),
! progname, path);
! exit(1);
}
/*
* Attempt to read the existing pg_control file
*/
if (!ReadControlFile())
! GuessControlValues();
/*
* Adjust fields if required by switches. (Do this now so that printout,
--- 319,335 ----
}
else
{
! PidLocked = true;
}
/*
* Attempt to read the existing pg_control file
*/
if (!ReadControlFile())
! {
! /* The control file is corrupted. */
! ctlcorrupted = true;
! }
/*
* Adjust fields if required by switches. (Do this now so that printout,
***************
*** 294,319 ****
ControlFile.logSeg = minXlogSeg;
}
/*
! * If we had to guess anything, and -f was not given, just print the
! * guessed values and exit. Also print if -n is given.
*/
! if ((guessed && !force) || noupdate)
{
! PrintControlValues(guessed);
! if (!noupdate)
{
! printf(_("\nIf these values seem acceptable, use -f to force reset.\n"));
! exit(1);
! }
! else
exit(0);
}
/*
* Don't reset from a dirty pg_control without -f, either.
*/
! if (ControlFile.state != DB_SHUTDOWNED && !force)
{
printf(_("The database server was not shut down cleanly.\n"
"Resetting the transaction log may cause data to be lost.\n"
--- 358,438 ----
ControlFile.logSeg = minXlogSeg;
}
+ /* Restore the broken control file from the WAL files. */
+ if (restore)
+ {
+
+ /* If the control file is fine, don't touch it. */
+ if ( !ctlcorrupted )
+ {
+ printf(_("\nThe control file seems fine; there is no need to restore it.\n"));
+ printf(_("If you want to restore it anyway, use the -f option, but this will also reset the log file.\n"));
+ exit(0);
+ }
+
+
+ /* Try to restore control values from the old xlog files, or complain. */
+ if (RestoreControlValues(WAL))
+ {
+ /* Success in restoring the checkpoint information from old xlog file.*/
+
+ /* Print it out.*/
+ PrintControlValues();
+
+ /* A lock file exists: assume the postmaster crashed.
+ * This may be dangerous if a postmaster is still running;
+ * a better check is probably needed.
+ */
+ if (PidLocked)
+ {
+ ControlFile.state = DB_IN_PRODUCTION;
+ }
+ /* Write the new control file. */
+ RewriteControlFile();
+ printf(_("\nThe control file has been restored.\n"));
+ }
+ else
+ {
+ /* Failed to restore the checkpoint information from the old xlog files. */
+ printf(_("\nCannot restore the control file from the xlog files.\n"));
+ printf(_("\nIf you want to restore it anyway, use the -f option to guess the information, but this will also reset the log file.\n"));
+ }
+
+ exit(0);
+
+ }
+ if (PidLocked)
+ {
+ fprintf(stderr, _("%s: lock file \"%s\" exists\n"
+ "Is a server running? If not, delete the lock file and try again.\n"),
+ progname, path);
+ exit(1);
+
+ }
/*
! * Print out the values in the control file if -n is given. If the control
! * file is corrupted, tell the user to restore it first.
*/
! if (noupdate)
{
! if (!ctlcorrupted)
{
! /* The control file is fine, print the values out.*/
! PrintControlValues();
exit(0);
+ }
+ else{
+ /* The control file is corrupted.*/
+ printf(_("The control file is corrupted.\n"));
+ printf(_("Please use the -r option to restore it first.\n"));
+ exit(1);
+ }
}
/*
* Don't reset from a dirty pg_control without -f, either.
*/
! if (ControlFile.state != DB_SHUTDOWNED && !force && !ctlcorrupted)
{
printf(_("The database server was not shut down cleanly.\n"
"Resetting the transaction log may cause data to be lost.\n"
***************
*** 321,334 ****
exit(1);
}
! /*
! * Else, do the dirty deed.
*/
RewriteControlFile();
KillExistingXLOG();
WriteEmptyXLOG();
!
! printf(_("Transaction log reset\n"));
return 0;
}
--- 440,474 ----
exit(1);
}
! /*
! * Try to reset the xlog file.
*/
+
+ /* If the control file is corrupted and -f is given, restore it first. */
+ if ( ctlcorrupted )
+ {
+ if (force)
+ {
+ if (!RestoreControlValues(WAL))
+ {
+ printf(_("Could not recover the control file from old xlog files; guessing values instead.\n"));
+ RestoreControlValues(GUESS);
+ }
+ printf(_("Rebuilt the control file values.\n"));
+ }
+ else
+ {
+ printf(_("Control file corrupted.\nIf you want to proceed anyway, use -f to force reset.\n"));
+ exit(1);
+ }
+ }
+
+ /* Reset the xlog files. */
+ UpdateCtlFile4Reset();
RewriteControlFile();
KillExistingXLOG();
WriteEmptyXLOG();
! printf(_("Transaction log reset\n"));
return 0;
}
***************
*** 397,403 ****
progname);
/* We will use the data anyway, but treat it as guessed. */
memcpy(&ControlFile, buffer, sizeof(ControlFile));
- guessed = true;
return true;
}
--- 537,542 ----
***************
*** 408,458 ****
}
/*
! * Guess at pg_control values when we can't read the old ones.
*/
! static void
! GuessControlValues(void)
{
- uint64 sysidentifier;
struct timeval tv;
char *localeptr;
/*
* Set up a completely default set of pg_control values.
*/
- guessed = true;
memset(&ControlFile, 0, sizeof(ControlFile));
ControlFile.pg_control_version = PG_CONTROL_VERSION;
ControlFile.catalog_version_no = CATALOG_VERSION_NO;
! /*
! * Create a new unique installation identifier, since we can no longer use
! * any old XLOG records. See notes in xlog.c about the algorithm.
*/
! gettimeofday(&tv, NULL);
! sysidentifier = ((uint64) tv.tv_sec) << 32;
! sysidentifier |= (uint32) (tv.tv_sec | tv.tv_usec);
!
! ControlFile.system_identifier = sysidentifier;
!
! ControlFile.checkPointCopy.redo.xlogid = 0;
! ControlFile.checkPointCopy.redo.xrecoff = SizeOfXLogLongPHD;
! ControlFile.checkPointCopy.undo = ControlFile.checkPointCopy.redo;
! ControlFile.checkPointCopy.ThisTimeLineID = 1;
! ControlFile.checkPointCopy.nextXid = (TransactionId) 514; /* XXX */
! ControlFile.checkPointCopy.nextOid = FirstBootstrapObjectId;
! ControlFile.checkPointCopy.nextMulti = FirstMultiXactId;
! ControlFile.checkPointCopy.nextMultiOffset = 0;
! ControlFile.checkPointCopy.time = time(NULL);
- ControlFile.state = DB_SHUTDOWNED;
ControlFile.time = time(NULL);
! ControlFile.logId = 0;
! ControlFile.logSeg = 1;
! ControlFile.checkPoint = ControlFile.checkPointCopy.redo;
!
ControlFile.maxAlign = MAXIMUM_ALIGNOF;
ControlFile.floatFormat = FLOATFORMAT_VALUE;
ControlFile.blcksz = BLCKSZ;
--- 547,627 ----
}
+
+
/*
! * Restore the pg_control values by scanning old xlog files or by guessing them.
! *
! * Input parameter:
! * WAL: Restore the pg_control values by scanning old xlog files.
! * GUESS: Restore the pg_control values by guessing.
! * Return:
! * true: the values were restored.
! * false: the values could not be restored.
! *
*/
! static bool
! RestoreControlValues(int mode)
{
struct timeval tv;
char *localeptr;
+ bool successed=true;
/*
* Set up a completely default set of pg_control values.
*/
memset(&ControlFile, 0, sizeof(ControlFile));
ControlFile.pg_control_version = PG_CONTROL_VERSION;
ControlFile.catalog_version_no = CATALOG_VERSION_NO;
! /*
! * Update the checkpoint values in the control file, either by searching
! * the xlog segment files or by guessing them.
*/
! if (mode == WAL)
! {
! int result = SearchLastCheckpoint();
! if ( result > 0 ) /* The last checkpoint had been found. */
! {
! ControlFile.checkPointCopy = lastcheckpoint;
! ControlFile.checkPoint = lastchkp;
! ControlFile.prevCheckPoint = prevchkp;
! ControlFile.logId = LastXLogFile->logid;
! ControlFile.logSeg = LastXLogFile->seg + 1;
! ControlFile.checkPointCopy.ThisTimeLineID = LastXLogFile->tli;
! ControlFile.state = state;
! } else successed = false;
!
! /* Clean up the list. */
! CleanUpList(xlogfilelist);
!
! }
!
! if (mode == GUESS)
! {
! ControlFile.checkPointCopy.redo.xlogid = 0;
! ControlFile.checkPointCopy.redo.xrecoff = SizeOfXLogLongPHD;
! ControlFile.checkPointCopy.undo = ControlFile.checkPointCopy.redo;
! ControlFile.checkPointCopy.nextXid = (TransactionId) 514; /* XXX */
! ControlFile.checkPointCopy.nextOid = FirstBootstrapObjectId;
! ControlFile.checkPointCopy.nextMulti = FirstMultiXactId;
! ControlFile.checkPointCopy.nextMultiOffset = 0;
! ControlFile.checkPointCopy.time = time(NULL);
! ControlFile.checkPoint = ControlFile.checkPointCopy.redo;
! /*
! * Create a new unique installation identifier, since we can no longer
! * use any old XLOG records. See notes in xlog.c about the algorithm.
! */
! gettimeofday(&tv, NULL);
! sysidentifier = ((uint64) tv.tv_sec) << 32;
! sysidentifier |= (uint32) (tv.tv_sec | tv.tv_usec);
! ControlFile.state = DB_SHUTDOWNED;
!
! }
ControlFile.time = time(NULL);
! ControlFile.system_identifier = sysidentifier;
ControlFile.maxAlign = MAXIMUM_ALIGNOF;
ControlFile.floatFormat = FLOATFORMAT_VALUE;
ControlFile.blcksz = BLCKSZ;
***************
*** 483,510 ****
}
StrNCpy(ControlFile.lc_ctype, localeptr, LOCALE_NAME_BUFLEN);
! /*
! * XXX eventually, should try to grovel through old XLOG to develop more
! * accurate values for TimeLineID, nextXID, etc.
! */
}
/*
! * Print the guessed pg_control values when we had to guess.
*
* NB: this display should be just those fields that will not be
* reset by RewriteControlFile().
*/
static void
! PrintControlValues(bool guessed)
{
char sysident_str[32];
! if (guessed)
! printf(_("Guessed pg_control values:\n\n"));
! else
! printf(_("pg_control values:\n\n"));
/*
* Format system_identifier separately to keep platform-dependent format
--- 652,673 ----
}
StrNCpy(ControlFile.lc_ctype, localeptr, LOCALE_NAME_BUFLEN);
! return successed;
}
/*
! * Print out the pg_control values.
*
* NB: this display should be just those fields that will not be
* reset by RewriteControlFile().
*/
static void
! PrintControlValues(void)
{
char sysident_str[32];
! printf(_("pg_control values:\n\n"));
/*
* Format system_identifier separately to keep platform-dependent format
***************
*** 538,553 ****
printf(_("LC_CTYPE: %s\n"), ControlFile.lc_ctype);
}
-
/*
! * Write out the new pg_control file.
! */
! static void
! RewriteControlFile(void)
{
- int fd;
- char buffer[PG_CONTROL_SIZE]; /* need not be aligned */
-
/*
* Adjust fields as needed to force an empty XLOG starting at the next
* available segment.
--- 701,712 ----
printf(_("LC_CTYPE: %s\n"), ControlFile.lc_ctype);
}
/*
! * Update the control file values before resetting the xlog.
! */
! static void
! UpdateCtlFile4Reset(void)
{
/*
* Adjust fields as needed to force an empty XLOG starting at the next
* available segment.
***************
*** 578,583 ****
--- 737,753 ----
ControlFile.checkPoint = ControlFile.checkPointCopy.redo;
ControlFile.prevCheckPoint.xlogid = 0;
ControlFile.prevCheckPoint.xrecoff = 0;
+ }
+
+ /*
+ * Write out the new pg_control file.
+ */
+ static void
+ RewriteControlFile(void)
+ {
+ int fd;
+ char buffer[PG_CONTROL_SIZE]; /* need not be aligned */
+
/* Contents are protected with a CRC */
INIT_CRC32(ControlFile.crc);
***************
*** 672,678 ****
errno = 0;
}
#ifdef WIN32
-
/*
* This fix is in mingw cvs (runtime/mingwex/dirent.c rev 1.4), but not in
* released version
--- 842,847 ----
***************
*** 801,814 ****
printf(_("%s resets the PostgreSQL transaction log.\n\n"), progname);
printf(_("Usage:\n %s [OPTION]... DATADIR\n\n"), progname);
printf(_("Options:\n"));
! printf(_(" -f force update to be done\n"));
printf(_(" -l TLI,FILE,SEG force minimum WAL starting location for new transaction log\n"));
! printf(_(" -m XID set next multitransaction ID\n"));
! printf(_(" -n no update, just show extracted control values (for testing)\n"));
printf(_(" -o OID set next OID\n"));
! printf(_(" -O OFFSET set next multitransaction offset\n"));
printf(_(" -x XID set next transaction ID\n"));
printf(_(" --help show this help, then exit\n"));
printf(_(" --version output version information, then exit\n"));
printf(_("\nReport bugs to .\n"));
}
--- 970,1633 ----
printf(_("%s resets the PostgreSQL transaction log.\n\n"), progname);
printf(_("Usage:\n %s [OPTION]... DATADIR\n\n"), progname);
printf(_("Options:\n"));
! printf(_(" -f force reset of the xlog; if the control file is corrupted, try to restore it first\n"));
! printf(_(" -r restore the pg_control file from old xlog files without resetting the log\n"));
printf(_(" -l TLI,FILE,SEG force minimum WAL starting location for new transaction log\n"));
! printf(_(" -n show extracted control values of existing pg_control file\n"));
! printf(_(" -m multiXID set next multitransaction ID\n"));
printf(_(" -o OID set next OID\n"));
! printf(_(" -O multiOffset set next multitransaction offset\n"));
printf(_(" -x XID set next transaction ID\n"));
printf(_(" --help show this help, then exit\n"));
printf(_(" --version output version information, then exit\n"));
printf(_("\nReport bugs to .\n"));
}
+
+
+
+ /*
+ * The following routines are mainly used for getting pg_control values
+ * from the xlog file.
+ */
+
+ /* Some local variables. */
+ static int logFd=0; /* kernel FD for current input file */
+ static int logRecOff; /* offset of next record in page */
+ static char pageBuffer[BLCKSZ]; /* current page */
+ static XLogRecPtr curRecPtr; /* logical address of current record */
+ static XLogRecPtr prevRecPtr; /* logical address of previous record */
+ static char *readRecordBuf = NULL; /* ReadRecord result area */
+ static uint32 readRecordBufSize = 0;
+ static int32 logPageOff; /* offset of current page in file */
+ static uint32 logId; /* current log file id */
+ static uint32 logSeg; /* current log file segment */
+ static uint32 logTli; /* current log file timeline */
+
+ /*
+ * Get existing XLOG files
+ */
+ static void
+ GetXLogFiles(void)
+ {
+ DIR *xldir;
+ struct dirent *xlde;
+
+ /* Open the xlog directory. */
+ xldir = opendir(XLOGDIR);
+ if (xldir == NULL)
+ {
+ fprintf(stderr, _("%s: could not open directory \"%s\": %s\n"),
+ progname, XLOGDIR, strerror(errno));
+ exit(1);
+ }
+
+ /* Search the directory, insert the segment files into the xlogfilelist.*/
+ errno = 0;
+ while ((xlde = readdir(xldir)) != NULL)
+ {
+ if (ValidXLogFileName(xlde->d_name)) {
+ /* XLog file is found, insert it into the xlogfilelist.*/
+ InsertXLogFile(xlde->d_name);
+ }
+ errno = 0;
+ }
+ #ifdef WIN32
+ if (GetLastError() == ERROR_NO_MORE_FILES)
+ errno = 0;
+ #endif
+
+ if (errno)
+ {
+ fprintf(stderr, _("%s: could not read from directory \"%s\": %s\n"),
+ progname, XLOGDIR, strerror(errno));
+ exit(1);
+ }
+ closedir(xldir);
+ }
+
+ /*
+ * Insert a file that was found in the xlog directory into xlogfilelist.
+ * The list is maintained in increasing order.
+ *
+ * The input parameter is the name of the xlog file; the name is assumed
+ * to be valid.
+ */
+ static void
+ InsertXLogFile( char * fname )
+ {
+ XLogFileName * NewSegFile, *Curr, *Prev;
+ bool append2end = false;
+
+ /* Allocate a new node for the new file. */
+ NewSegFile = (XLogFileName *) malloc(sizeof(XLogFileName));
+ strcpy(NewSegFile->fname,fname); /* setup the name */
+ /* extract the time line, logid, and segment number from the name.*/
+ sscanf(fname, "%8x%8x%8x", &(NewSegFile->tli), &(NewSegFile->logid), &(NewSegFile->seg));
+ NewSegFile->next = NULL;
+
+ /* Ensure the xlog file is active and valid.*/
+ if (! ValidXLogFileHeader(NewSegFile))
+ {
+ free(NewSegFile);
+ return;
+ }
+
+ /* the list is empty.*/
+ if ( xlogfilelist == NULL ) {
+ xlogfilelist = NewSegFile;
+ return;
+ }
+
+ /* try to search the list and find the insert point. */
+ Prev=Curr=xlogfilelist;
+ while( CmpXLogFileOT(NewSegFile, Curr))
+ {
+ /* the node is appended to the end of the list.*/
+ if (Curr->next == NULL)
+ {
+ append2end = true;
+ break;
+ }
+ Prev=Curr;
+ Curr = Curr->next;
+ }
+
+ /* Insert the new node to the list.*/
+ if ( append2end )
+ {
+ /* We need to append the new node to the end of the list */
+ Curr->next = NewSegFile;
+ }
+ else
+ {
+ NewSegFile->next = Curr;
+ /* Link from the predecessor unless we are inserting at the head. */
+ if (Curr != xlogfilelist)
+ {
+ Prev->next = NewSegFile;
+ }
+ }
+ /* Update the list head if it is needed.*/
+ if ((Curr == xlogfilelist) && !append2end)
+ {
+ xlogfilelist = NewSegFile;
+ }
+
+ }
+
+ /*
+ * Compare two xlog files by name to see which one is later.
+ *
+ * Return true unless file 2 is strictly later than file 1.
+ *
+ */
+ static bool
+ CmpXLogFileOT(XLogFileName * f1, XLogFileName *f2)
+ {
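+ if (f1->tli != f2->tli)
+ return f1->tli > f2->tli;
+ if (f1->logid != f2->logid)
+ return f1->logid > f2->logid;
+ /* Same timeline and log id: order by segment number. */
+ if (f2->seg > f1->seg)
+ return false;
+ return true;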
+
+ }
+
+ /* Check whether two segment files are contiguous. */
+ static bool
+ IsNextSeg(XLogFileName *prev, XLogFileName *cur)
+ {
+ uint32 logid, logseg;
+
+ if (prev->tli != cur->tli) return false;
+
+ logid = prev->logid;
+ logseg = prev->seg;
+ NextLogSeg(logid, logseg);
+
+ if ((logid == cur->logid) && (logseg == cur->seg)) return true;
+
+ return false;
+
+ }
+
+
+ /*
+ * Select the oldest xlog file in the latest time line.
+ */
+ static void
+ SelectStartXLog( void )
+ {
+ XLogFileName *tmp;
+ CurXLogFile = xlogfilelist;
+
+ if (xlogfilelist == NULL)
+ {
+ return;
+ }
+
+ tmp=LastXLogFile=CurXLogFile=xlogfilelist;
+
+ while(tmp->next != NULL)
+ {
+
+ /*
+ * Ensure the run of segment files from the first
+ * to the last is contiguous; restart it after a gap.
+ */
+ if (!IsNextSeg(tmp, tmp->next))
+ {
+ CurXLogFile = tmp->next;
+ }
+ tmp=tmp->next;
+ }
+
+ LastXLogFile = tmp;
+
+ }
+
+ /*
+ * Check whether a file name is a valid xlog file name.
+ *
+ * Return true if the name is a valid xlog file name.
+ *
+ * The input parameter is the name of the xlog file.
+ *
+ */
+ static bool
+ ValidXLogFileName(char * fname)
+ {
+ uint logTLI, logId, logSeg;
+ if (strlen(fname) != 24 ||
+ strspn(fname, "0123456789ABCDEF") != 24 ||
+ sscanf(fname, "%8x%8x%8x", &logTLI, &logId, &logSeg) != 3)
+ return false;
+ return true;
+
+ }
+
+ /* Ensure the xlog file is active and valid.*/
+ static bool
+ ValidXLogFileHeader(XLogFileName *segfile)
+ {
+ int fd;
+ char buffer[BLCKSZ];
+ char path[MAXPGPATH];
+ size_t nread;
+
+ snprintf(path, MAXPGPATH, "%s/%s", XLOGDIR, segfile->fname);
+ fd = open(path, O_RDONLY | PG_BINARY, 0);
+ if (fd < 0)
+ {
+ return false;
+ }
+ nread = read(fd, buffer, BLCKSZ);
+ close(fd);
+ if (nread == BLCKSZ)
+ {
+ XLogPageHeader hdr = (XLogPageHeader)buffer;
+
+ if (ValidXLOGPageHeader(hdr, segfile->tli, segfile->logid, segfile->seg))
+ {
+ return true;
+ }
+ }
+ return false;
+
+ }
+ static bool
+ ValidXLOGPageHeader(XLogPageHeader hdr, uint tli, uint id, uint seg)
+ {
+ XLogRecPtr recaddr;
+
+ if (hdr->xlp_magic != XLOG_PAGE_MAGIC)
+ {
+ return false;
+ }
+ if ((hdr->xlp_info & ~XLP_ALL_FLAGS) != 0)
+ {
+ return false;
+ }
+ if (hdr->xlp_info & XLP_LONG_HEADER)
+ {
+ XLogLongPageHeader longhdr = (XLogLongPageHeader) hdr;
+
+ if (longhdr->xlp_seg_size != XLogSegSize)
+ {
+ return false;
+ }
+ /* Get the system identifier from the segment file header.*/
+ sysidentifier = longhdr->xlp_sysid;
+ }
+
+ recaddr.xlogid = id;
+ recaddr.xrecoff = seg * XLogSegSize + logPageOff;
+ if (!XLByteEQ(hdr->xlp_pageaddr, recaddr))
+ {
+ return false;
+ }
+
+ if (hdr->xlp_tli != tli)
+ {
+ return false;
+ }
+ return true;
+ }
+
+
+ /* Read another page, if possible */
+ static bool
+ ReadXLogPage(void)
+ {
+ size_t nread;
+
+ /* Need to advance to the new segment file.*/
+ if ( logPageOff >= XLogSegSize )
+ {
+ close(logFd);
+ logFd = 0;
+ }
+
+ /* Need to open the segment file. */
+ if ((logFd <= 0) && (CurXLogFile != NULL))
+ {
+ if (OpenXLogFile(CurXLogFile) < 0)
+ {
+ return false;
+ }
+ CurXLogFile = CurXLogFile->next;
+ }
+
+ /* Read a page from the open segment file. */
+ nread = read(logFd, pageBuffer, BLCKSZ);
+
+ if (nread == BLCKSZ)
+ {
+ logPageOff += BLCKSZ;
+ if (ValidXLOGPageHeader( (XLogPageHeader)pageBuffer, logTli, logId, logSeg))
+ return true;
+ }
+
+ return false;
+ }
+
+ /*
+ * CRC-check an XLOG record. We do not believe the contents of an XLOG
+ * record (other than to the minimal extent of computing the amount of
+ * data to read in) until we've checked the CRCs.
+ *
+ * We assume all of the record has been read into memory at *record.
+ */
+ static bool
+ RecordIsValid(XLogRecord *record, XLogRecPtr recptr)
+ {
+ pg_crc32 crc;
+ int i;
+ uint32 len = record->xl_len;
+ BkpBlock bkpb;
+ char *blk;
+
+ /* First the rmgr data */
+ INIT_CRC32(crc);
+ COMP_CRC32(crc, XLogRecGetData(record), len);
+
+ /* Add in the backup blocks, if any */
+ blk = (char *) XLogRecGetData(record) + len;
+ for (i = 0; i < XLR_MAX_BKP_BLOCKS; i++)
+ {
+ uint32 blen;
+
+ if (!(record->xl_info & XLR_SET_BKP_BLOCK(i)))
+ continue;
+
+ memcpy(&bkpb, blk, sizeof(BkpBlock));
+ if (bkpb.hole_offset + bkpb.hole_length > BLCKSZ)
+ {
+ return false;
+ }
+ blen = sizeof(BkpBlock) + BLCKSZ - bkpb.hole_length;
+ COMP_CRC32(crc, blk, blen);
+ blk += blen;
+ }
+
+ /* Check that xl_tot_len agrees with our calculation */
+ if (blk != (char *) record + record->xl_tot_len)
+ {
+ return false;
+ }
+
+ /* Finally include the record header */
+ COMP_CRC32(crc, (char *) record + sizeof(pg_crc32),
+ SizeOfXLogRecord - sizeof(pg_crc32));
+ FIN_CRC32(crc);
+
+ if (!EQ_CRC32(record->xl_crc, crc))
+ {
+ return false;
+ }
+
+ return true;
+ }
+
+
+
+ /*
+ * Attempt to read an XLOG record into readRecordBuf.
+ */
+ static bool
+ FetchRecord(void)
+ {
+ char *buffer;
+ XLogRecord *record;
+ XLogContRecord *contrecord;
+ uint32 len, total_len;
+
+
+ while (logRecOff <= 0 || logRecOff > BLCKSZ - SizeOfXLogRecord)
+ {
+ /* Need to advance to new page */
+ if (! ReadXLogPage())
+ {
+ return false;
+ }
+
+ logRecOff = XLogPageHeaderSize((XLogPageHeader) pageBuffer);
+ if ((((XLogPageHeader) pageBuffer)->xlp_info & ~XLP_LONG_HEADER) != 0)
+ {
+ /* Check for a continuation record */
+ if (((XLogPageHeader) pageBuffer)->xlp_info & XLP_FIRST_IS_CONTRECORD)
+ {
+ contrecord = (XLogContRecord *) (pageBuffer + logRecOff);
+ logRecOff += MAXALIGN(contrecord->xl_rem_len + SizeOfXLogContRecord);
+ }
+ }
+ }
+
+ curRecPtr.xlogid = logId;
+ curRecPtr.xrecoff = logSeg * XLogSegSize + logPageOff + logRecOff;
+ record = (XLogRecord *) (pageBuffer + logRecOff);
+
+ if (record->xl_len == 0)
+ {
+ return false;
+ }
+
+ total_len = record->xl_tot_len;
+
+ /*
+ * Allocate or enlarge readRecordBuf as needed. To avoid useless
+ * small increases, round its size to a multiple of BLCKSZ, and make
+ * sure it's at least 4*BLCKSZ to start with. (That is enough for all
+ * "normal" records, but very large commit or abort records might need
+ * more space.)
+ */
+ if (total_len > readRecordBufSize)
+ {
+ uint32 newSize = total_len;
+
+ newSize += BLCKSZ - (newSize % BLCKSZ);
+ newSize = Max(newSize, 4 * BLCKSZ);
+ if (readRecordBuf)
+ free(readRecordBuf);
+ readRecordBuf = (char *) malloc(newSize);
+ if (!readRecordBuf)
+ {
+ readRecordBufSize = 0;
+ return false;
+ }
+ readRecordBufSize = newSize;
+ }
+
+ buffer = readRecordBuf;
+ len = BLCKSZ - curRecPtr.xrecoff % BLCKSZ; /* available in block */
+ if (total_len > len)
+ {
+ /* Need to reassemble record */
+ uint32 gotlen = len;
+
+ memcpy(buffer, record, len);
+ record = (XLogRecord *) buffer;
+ buffer += len;
+ for (;;)
+ {
+ uint32 pageHeaderSize;
+
+ if (!ReadXLogPage())
+ {
+ return false;
+ }
+ if (!(((XLogPageHeader) pageBuffer)->xlp_info & XLP_FIRST_IS_CONTRECORD))
+ {
+ return false;
+ }
+ pageHeaderSize = XLogPageHeaderSize((XLogPageHeader) pageBuffer);
+ contrecord = (XLogContRecord *) (pageBuffer + pageHeaderSize);
+ if (contrecord->xl_rem_len == 0 ||
+ total_len != (contrecord->xl_rem_len + gotlen))
+ {
+ return false;
+ }
+ len = BLCKSZ - pageHeaderSize - SizeOfXLogContRecord;
+ if (contrecord->xl_rem_len > len)
+ {
+ memcpy(buffer, (char *) contrecord + SizeOfXLogContRecord, len);
+ gotlen += len;
+ buffer += len;
+ continue;
+ }
+ memcpy(buffer, (char *) contrecord + SizeOfXLogContRecord,
+ contrecord->xl_rem_len);
+ logRecOff = MAXALIGN(pageHeaderSize + SizeOfXLogContRecord + contrecord->xl_rem_len);
+ break;
+ }
+ if (!RecordIsValid(record, curRecPtr))
+ {
+ return false;
+ }
+ return true;
+ }
+ /* Record is contained in this page */
+ memcpy(buffer, record, total_len);
+ record = (XLogRecord *) buffer;
+ logRecOff += MAXALIGN(total_len);
+ if (!RecordIsValid(record, curRecPtr))
+ {
+
+ return false;
+ }
+ return true;
+ }
+
+ /*
+ * If the record is a checkpoint, update the latest checkpoint record.
+ */
+ static void
+ UpdateCheckPoint(XLogRecord *record)
+ {
+ uint8 info = record->xl_info & ~XLR_INFO_MASK;
+
+ if ((info == XLOG_CHECKPOINT_SHUTDOWN) ||
+ (info == XLOG_CHECKPOINT_ONLINE))
+ {
+ CheckPoint *chkpoint = (CheckPoint *) XLogRecGetData(record);
+ prevchkp = lastchkp;
+ lastchkp = curRecPtr;
+ lastcheckpoint = *chkpoint;
+
+ /* Update the database state to match the checkpoint type. */
+ switch (info)
+ {
+ case XLOG_CHECKPOINT_SHUTDOWN:
+ state = DB_SHUTDOWNED;
+ break;
+ case XLOG_CHECKPOINT_ONLINE:
+ state = DB_IN_PRODUCTION;
+ break;
+ }
+ found_checkpoint++;
+ }
+ }
+
+ static int
+ OpenXLogFile(XLogFileName *sf)
+ {
+
+ char path[MAXPGPATH];
+
+ if (logFd > 0) close(logFd);
+
+ /* Open an XLOG segment file. */
+ snprintf(path, MAXPGPATH, "%s/%s", XLOGDIR, sf->fname);
+ logFd = open(path, O_RDONLY | PG_BINARY, 0);
+
+ if (logFd < 0)
+ {
+ fprintf(stderr, _("%s: could not open xlog file \"%s\"\n"), progname, path);
+ return -1;
+ }
+
+ /* Set up the parameters for searching. */
+ logPageOff = -BLCKSZ; /* so 1st increment in ReadXLogPage gives 0 */
+ logRecOff = 0;
+ logId = sf->logid;
+ logSeg = sf->seg;
+ logTli = sf->tli;
+ return logFd;
+ }
+
+ /*
+ * Search for the latest checkpoint in the latest XLOG segment file.
+ *
+ * The return value is the total number of checkpoints found
+ * in the XLOG segment files, or 0 if the search failed.
+ */
+ static int
+ SearchLastCheckpoint(void)
+ {
+
+ /* Retrieve all of the active xlog files from the xlog directory
+ * into a list in increasing order of timeline, log id, and
+ * segment id.
+ */
+ GetXLogFiles();
+
+ /* Select the oldest segment file in the latest timeline. */
+ SelectStartXLog();
+
+ /* No segment file was found. */
+ if (CurXLogFile == NULL)
+ {
+ return 0;
+ }
+
+ /* Initialize the search state. */
+ logFd = logId = logSeg = logTli = 0;
+
+ /*
+ * Scan the XLOG segment files from beginning to end.
+ * Whenever a checkpoint record is found, update the
+ * latest checkpoint.
+ */
+ while (FetchRecord())
+ {
+ /* Only XLOG resource manager records can be checkpoint records. */
+ if (((XLogRecord *) readRecordBuf)->xl_rmid == RM_XLOG_ID)
+ UpdateCheckPoint((XLogRecord *) readRecordBuf);
+ prevRecPtr = curRecPtr;
+ }
+
+ /* We cannot tell for certain whether we reached the end of the log.
+ * As an approximation, check that the scan stopped in the last
+ * segment file; if it did not, the search was cut short.
+ * (We need a better way to detect an abnormal break during the search.)
+ */
+ if ((logId != LastXLogFile->logid) || (logSeg != LastXLogFile->seg))
+ {
+ return 0;
+ }
+
+ /*
+ * Return the number of checkpoints found so far, so that
+ * the caller knows how many checkpoints were found.
+ */
+ return found_checkpoint;
+ }
+
+ /* Free every entry of the allocated list. */
+ static void
+ CleanUpList(XLogFileName *list)
+ {
+
+ XLogFileName *tmp;
+ tmp = list;
+ while (list != NULL)
+ {
+ tmp = list->next;
+ free(list);
+ list = tmp;
+ }
+
+ }
+