From: | Mark Kirkwood <markir(at)coretech(dot)co(dot)nz> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Point in Time Recovery |
Date: | 2004-07-15 23:46:54 |
Message-ID: | 40F7176E.4000001@coretech.co.nz |
Lists: | pgsql-admin pgsql-hackers pgsql-patches |
Simon Riggs wrote:
>
>So far:
>
>I've tried to re-create the problem as exactly as I can, but it works
>for me.
>
>This is clearly an important case to chase down.
>
>I assume that this is the very first time you tried recovery? Second and
>subsequent recoveries using the same set have a potential loophole,
>which we have been discussing.
>
>Right now, I'm thinking that the "exactly 2 logs worth" of data has
>brought you very close to the end of the log file (FFFFE0) ending with 1
>and the shutdown checkpoint that is then subsequently written is
>failing.
>
>Can you repeat this your end?
>
>
>
It is repeatable at my end. The example I am using is fairly easy to
recreate: download benchw from
http://sourceforge.net/projects/benchw
and generate the dataset for Pg, but trim the large "fact0.dat" dump
file using head -100000.
Step 7 then consists of creating the 4 tables and COPYing in the data
for them.
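For anyone scripting the reproduction, the trimming step could be sketched as below. This is only an illustration equivalent to head -100000; the function name trim_lines and the output file name in the note that follows are hypothetical, not part of benchw.

```python
from itertools import islice


def trim_lines(src_path, dst_path, max_lines=100000):
    """Copy at most max_lines leading lines of src_path into dst_path.

    Hypothetical helper mirroring `head -100000`; returns the number
    of lines actually written.
    """
    written = 0
    # Binary mode keeps the dump bytes untouched (no newline translation).
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in islice(src, max_lines):
            dst.write(line)
            written += 1
    return written
```

A call such as trim_lines("fact0.dat", "fact0.trimmed.dat") would leave the original dump intact and write the first 100000 rows to the second (assumed) file name.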
>The nearest I can get to the exact record pointers you show are to start
>recovery at A4807C and to end at FFFF88.
>
>Overall, PITR changes the recovery process very little, if at all. The
>main areas of effect are to do with sequencing of actions and matching
>up the right logs with the right backup. I'm not looking for bugs in the
>code but in subtle side-effects and "edge" cases. Everything you can
>tell me will help me greatly in chasing that down.
>
>
>
I agree - I will try this sort of example again, but will change the
number of rows I am COPYing (currently 100000) and see if that helps.
>Best Regards, Simon Riggs
>
>
>
By way of contrast, using the *same* procedure (1-11), but generating 2
logs worth of INSERTs/UPDATEs using 10 concurrent processes *works fine* -
e.g.:
LOG: database system was interrupted at 2004-07-16 11:17:52 NZST
LOG: recovery command file found...
LOG: restore_program = cp %s/%s %s
LOG: recovery_target_inclusive = true
LOG: recovery_debug_log = true
LOG: starting archive recovery
LOG: restored log file "0000000000000000" from archive
LOG: checkpoint record is at 0/A4803C
LOG: redo record is at 0/A4803C; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 496; next OID: 25419
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 0/A4807C
postmaster starting
[postgres(at)shroudeater 7.5]$ LOG: restored log file "0000000000000001" from archive
cp: cannot stat `/data1/pgdata/7.5-archive/0000000000000002': No such file or directory
LOG: could not restore "0000000000000002" from archive
LOG: could not open file "/data1/pgdata/7.5/pg_xlog/0000000000000002" (log file 0, segment 2): No such file or directory
LOG: redo done at 0/1FFFFD4
LOG: archive recovery complete
LOG: database system is ready
LOG: archiver started
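
To see how close these record pointers sit to a log file boundary, the arithmetic can be sketched as follows. This is a rough illustration under two assumptions that are not stated in the log itself: that segments are the default 16 MB, and that file names are the log id and segment number printed as two 8-digit hex fields (inferred from names such as "0000000000000001" above). The helper names are hypothetical.

```python
# Assumption: default WAL segment size of 16 MB.
SEGMENT_SIZE = 16 * 1024 * 1024


def locate(location):
    """Split a 'logid/offset' hex location into
    (log id, segment number, offset in segment, bytes left in segment)."""
    log_hex, off_hex = location.split("/")
    log_id = int(log_hex, 16)
    segment, offset = divmod(int(off_hex, 16), SEGMENT_SIZE)
    return log_id, segment, offset, SEGMENT_SIZE - offset


def segment_file_name(log_id, segment):
    """Assumed naming scheme: two 8-digit upper-case hex fields."""
    return "%08X%08X" % (log_id, segment)


# Locations taken from the log output above.
for location in ("0/A4803C", "0/A4807C", "0/1FFFFD4"):
    log_id, segment, offset, left = locate(location)
    print(location, segment_file_name(log_id, segment), hex(offset), left)
```

Under those assumptions, "redo done at 0/1FFFFD4" falls in file 0000000000000001 at offset 0xFFFFD4, which is 44 bytes before the end of that segment. So the working run also ends very near the end of the second log file.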
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2004-07-15 23:51:53 | Re: Point in Time Recovery |
Previous Message | Mark Kirkwood | 2004-07-15 23:13:20 | Re: Point in Time Recovery |