From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: Recovery bug |
Date: | 2010-10-18 08:02:49 |
Message-ID: | AANLkTi=ffepY1tpduZBxLGBycg7NekF=KuacC4MW=Wwj@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
>> Send a SIGQUIT to the postmaster to simulate a crash. When you bring it
>> back up, it thinks it is recovering from a backup, so it reads
>> backup_label. The checkpoint for the backup label is in 00...6, so it
>> reads that just fine. But then it tries to read the WAL starting at the
>> redo location from that checkpoint, which is in 00...5 and it doesn't
>> exist and PANICs.
>>
>> Ordinarily you might say that this is just confusion over whether it's
>> recovering from a backup or not, and you just need to remove
>> backup_label and try again. But that doesn't work: at this point
>> StartupXLOG has already done two things:
>> 1. renamed the backup file to .old
>> 2. updated the control file
Good catch!
> I still think it would be nice if postgres knew whether it was restoring
> a backup or recovering from a crash, otherwise it's hard to
> automatically recover from failures. I thought about using the presence
> of recoveryRestoreCommand or PrimaryConnInfo to determine that. But it
> seemed potentially dangerous if the person restoring a backup simply
> forgot to set those, and then it tries restoring from the controldata
> instead (which is unsafe to do during a backup).
Yep, to automatically delete backup_label and continue recovery seems to be
dangerous. How about just emitting FATAL error when neither restore_command
nor primary_conninfo is supplied and backup_label exists? This seems to be
simpler than your proposed patch (i.e., check whether REDO location exists).
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Sebastian Frey | 2010-10-18 14:10:38 | How do I remove PostgreSQL completely? |
Previous Message | Joel Lopes Da Silva | 2010-10-18 06:31:10 | BUG #5715: man pages missing after compiling PostgreSQL 9.0.1 sources on OS X 10.6 |