Re: Recovery from multi trouble

From: OKADA Satoshi <okada(dot)satoshi(at)lab(dot)ntt(dot)co(dot)jp>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Recovery from multi trouble
Date: 2006-03-27 03:14:47
Message-ID: 442758A7.8020201@lab.ntt.co.jp
Lists: pgsql-hackers

Martijn van Oosterhout wrote:

>On Thu, Dec 22, 2005 at 10:53:39AM +0000, Simon Riggs wrote:
>
>
>>IMHO the problem is the deletion of the xlog file, not the error
>>message.
>>
>>If you *did* lose an xlog file, would you not expect the system to come
>>up anyway? You're saying that you'd want the system to stay down because
>>of this? Would you want the system to be less available in that
>>circumstance?
>>
>>
>
>Well, that's what pg_resetxlog does. If you have an unclean shutdown
>and you lose the xlog, you've possibly lost data. Should the postmaster
>just come up and pretend nothing happened?
>
>
>
>>I guess you might want a new postmaster option: "don't come up if you
>>are damaged". Would you really use that?
>>
>>
>
>Well, we have zero_damaged_pages, which is off by default.
>
>
>
>>Overall, thank you for doing the durability testing. It is good to know
>>that you're doing that and taking the time to report any issues you see.
>>
>>
>
>Having a system that just blithely continues in the face of possible
>data loss doesn't seem very nice either. Sure, it's nice to know about
>it but is it really something we can do something about? The admin
>either restores from backup or runs pg_resetxlog, accepting the fact
>data will be lost. I don't think this is something postgres should be
>doing on its own.
>
>

Thank you for your comments, and I'm sorry for my late reply.

Our aim is to give the database administrator a chance to recover the
database at PostgreSQL startup time when data may have been lost
because log files are missing.

Because we plan to use clustering software that automatically switches
to another server machine when the PostgreSQL server goes down for any
reason, it is appropriate for PostgreSQL to stop the startup process and
report the anomaly in such a case. The database administrator should
then perform a proper recovery operation manually.

Therefore, we propose a new startup option for postmaster:
* Introduce a new startup option (e.g. postmaster -R).
* When postmaster is started with this option and cannot read the xlog,
it stops the startup process with a "PANIC".
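
The intended behaviour could be sketched roughly as follows. This is
only an illustration of the decision, not a patch: StartupAction,
decide_startup and the strict_recovery flag (standing in for the
proposed -R option) are hypothetical names and do not exist in
PostgreSQL. The "warn and continue" branch reflects the behaviour
without the option as it was described in this thread.

```c
#include <stdbool.h>

/* Hypothetical outcome of the startup check; not a PostgreSQL type. */
typedef enum
{
    STARTUP_CONTINUE,   /* xlog is readable: normal startup and recovery */
    STARTUP_WARN_ONLY,  /* xlog unreadable, option not given: come up anyway */
    STARTUP_PANIC       /* xlog unreadable, option given: stop with PANIC */
} StartupAction;

/*
 * Decide what the startup process should do.
 *
 * xlog_readable   - whether the required xlog could be read
 * strict_recovery - whether the proposed option (e.g. -R) was given;
 *                   the name is an assumption made for this sketch
 */
StartupAction
decide_startup(bool xlog_readable, bool strict_recovery)
{
    if (xlog_readable)
        return STARTUP_CONTINUE;

    /* The xlog is missing or unreadable, so data may have been lost. */
    return strict_recovery ? STARTUP_PANIC : STARTUP_WARN_ONLY;
}
```

With the option set, the clustering software would see the server stay
down, and the administrator would then choose between restoring from
backup and running pg_resetxlog.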

We will make a patch. What do you think of this approach?

----
OKADA Satoshi
NTT Cyberspace Lab.
