Re: Point in Time Recovery

From: Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Point in Time Recovery
Date: 2004-07-15 23:46:54
Message-ID: 40F7176E.4000001@coretech.co.nz
Lists: pgsql-admin pgsql-hackers pgsql-patches

Simon Riggs wrote:

>
>So far:
>
>I've tried to re-create the problem as exactly as I can, but it works
>for me.
>
>This is clearly an important case to chase down.
>
>I assume that this is the very first time you tried recovery? Second and
>subsequent recoveries using the same set have a potential loophole,
>which we have been discussing.
>
>Right now, I'm thinking that the "exactly 2 logs worth" of data has
>brought you very close to the end of the log file (FFFFE0) ending with 1
>and the shutdown checkpoint that is then subsequently written is
>failing.
>
>Can you repeat this your end?
>
>
>
It is repeatable at my end. It is actually fairly easy to recreate the
example I am using: download

http://sourceforge.net/projects/benchw

and generate the dataset for Pg, but trim the large "fact0.dat" dump
file to its first 100000 lines using head -100000.
Thus step 7 consists of creating the 4 tables and COPYing in the data
for them.
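
For anyone without head to hand, the trimming step can be sketched in
Python. This is only an illustration of the same idea: the function name
and the output file name are made up here, and only "fact0.dat" and the
100000 line count come from the description above.

```python
from itertools import islice


def trim_head(src_path, dst_path, n=100000):
    """Copy the first n lines of src_path to dst_path, like `head -n`."""
    # Binary mode copies the dump bytes unchanged, whatever their encoding.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.writelines(islice(src, n))


# Hypothetical usage (the output name is an assumption, not from benchw):
# trim_head("fact0.dat", "fact0_trimmed.dat", 100000)
```

The trimmed file is then the one to COPY into the fact table in step 7.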

>The nearest I can get to the exact record pointers you show are to start
>recovery at A4807C and to end at FFFF88.
>
>Overall, PITR changes the recovery process very little, if at all. The
>main areas of effect are to do with sequencing of actions and matching
>up the right logs with the right backup. I'm not looking for bugs in the
>code but in subtle side-effects and "edge" cases. Everything you can
>tell me will help me greatly in chasing that down.
>
>
>
I agree - I will try this sort of example again, but will change the
number of rows I am COPYing (currently 100000) and see if that helps.

>Best Regards, Simon Riggs
>
>
>

By way of contrast, using the *same* procedure (1-11), but generating 2
logs' worth of INSERTS/UPDATES using 10 concurrent processes *works fine* -
e.g.:

LOG: database system was interrupted at 2004-07-16 11:17:52 NZST
LOG: recovery command file found...
LOG: restore_program = cp %s/%s %s
LOG: recovery_target_inclusive = true
LOG: recovery_debug_log = true
LOG: starting archive recovery
LOG: restored log file "0000000000000000" from archive
LOG: checkpoint record is at 0/A4803C
LOG: redo record is at 0/A4803C; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 496; next OID: 25419
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 0/A4807C
postmaster starting
[postgres(at)shroudeater 7.5]$ LOG: restored log file "0000000000000001" from archive
cp: cannot stat `/data1/pgdata/7.5-archive/0000000000000002': No such file or directory
LOG: could not restore "0000000000000002" from archive
LOG: could not open file "/data1/pgdata/7.5/pg_xlog/0000000000000002" (log file 0, segment 2): No such file or directory
LOG: redo done at 0/1FFFFD4
LOG: archive recovery complete
LOG: database system is ready
LOG: archiver started
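
To relate the record pointers in this log to the segment files, here is
a small sketch. It assumes the default 16 MB segment size and the
16-hex-digit file names seen above (log file number followed by segment
number); the function name is made up for illustration.

```python
XLOG_SEG_SIZE = 16 * 1024 * 1024  # assumed default segment size (16 MB)


def locate(pointer):
    """Map a 'logid/offset' pointer to (segment file name, offset in segment)."""
    logid_hex, off_hex = pointer.split("/")
    logid = int(logid_hex, 16)
    recoff = int(off_hex, 16)
    seg = recoff // XLOG_SEG_SIZE      # which segment within the log file
    offset = recoff % XLOG_SEG_SIZE    # byte offset inside that segment
    return "%08X%08X" % (logid, seg), offset


# Pointers taken from the log above.
for ptr in ("0/A4807C", "0/1FFFFD4"):
    name, offset = locate(ptr)
    print(ptr, name, hex(offset), XLOG_SEG_SIZE - offset)
```

Under those assumptions 0/A4807C falls in "0000000000000000", and
0/1FFFFD4 falls in "0000000000000001" at offset 0xFFFFD4, which is 44
bytes short of the end of that segment. So this run also finishes very
close to the end of the log file ending with 1.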
