
Re: bug of recovery?

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	Florian Pflug <fgp(at)phlo(dot)org>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: bug of recovery?
Date:	2011-09-30 06:57:02
Message-ID:	CA+U5nMK+S+bqZwEpwQUaVoE+gzqspwq8rp-6TQPFP77kyGS1NQ@mail.gmail.com
Lists:	pgsql-hackers

On Fri, Sep 30, 2011 at 2:09 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Thu, Sep 29, 2011 at 11:12 PM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
>> On Sep29, 2011, at 13:49 , Simon Riggs wrote:
>>> This worries me slightly now, though, because the patch makes us PANIC
>>> in a place we didn't use to, and once we do that we cannot restart the
>>> server at all. Are we sure we want that? It's certainly a great way to
>>> shake down errors in other code...
>>
>> The patch only introduces a new PANIC condition during archive recovery,
>> though. Crash recovery is unaffected, except that we no longer create
>> restart points before we reach consistency.
>>
>> Also, if we hit an invalid page reference after reaching consistency,
>> the cause is probably either a bug in our recovery code, or (quite unlikely)
>> a corrupted WAL that passed the CRC check. In both cases, the likelihood
>> of data corruption seems high, so PANICing seems like the right thing to do.
>
> Fair enough.
>
> We might be able to use FATAL or ERROR instead of PANIC because they
> also cause all processes to exit when the startup process emits them.
> For example, we now use FATAL to stop the server in recovery mode
> when recovery is about to end before we've reached a consistent state.

I think we should issue PANIC if the source is a critical rmgr, or
just WARNING if from a non-critical rmgr, such as indexes.
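
A minimal sketch of the rule proposed above, written in Python purely for
illustration rather than in PostgreSQL's C. The split of resource managers
into critical and non-critical is an assumption, and NON_CRITICAL_RMGRS and
invalid_page_severity are hypothetical names, not part of PostgreSQL.

```python
# Hypothetical sketch: choose a log level for an invalid page reference
# found during recovery, based on which resource manager (rmgr) wrote the
# WAL record. The classification below is an assumption for illustration.

# Index rmgrs, treated here as non-critical because an index can be
# rebuilt from its heap.
NON_CRITICAL_RMGRS = {"Btree", "Hash", "Gin", "Gist"}


def invalid_page_severity(rmgr_name: str) -> str:
    """Return "WARNING" for non-critical rmgrs and "PANIC" otherwise."""
    if rmgr_name in NON_CRITICAL_RMGRS:
        return "WARNING"
    # Anything not explicitly listed is treated as critical, which is
    # the conservative default.
    return "PANIC"


if __name__ == "__main__":
    for name in ("Heap", "Btree", "XLOG"):
        print(name, invalid_page_severity(name))
```

Defaulting unlisted rmgrs to PANIC keeps the behaviour of the patch under
discussion for everything except the rmgrs deliberately listed as
non-critical.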

Ideally, I think we should have a mechanism to allow indexes to be
marked corrupt. For example, a file whose presence shows that the
index is corrupt, so that the index is marked not valid. We can then
create the file and send a sinval message to force the index's relcache
entry to be rebuilt, showing valid set to false.
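
The marker-file idea could be sketched roughly as follows, again in Python
for illustration only. The ".corrupt" file name, the directory layout, and
the functions mark_index_corrupt and index_is_valid are all hypothetical.
The sinval message and the relcache rebuild are not modelled; the validity
check simply re-reads the filesystem each time.

```python
import os
import tempfile


def marker_path(data_dir: str, index_id: int) -> str:
    """Hypothetical location of the marker file for one index."""
    return os.path.join(data_dir, f"{index_id}.corrupt")


def mark_index_corrupt(data_dir: str, index_id: int) -> None:
    """Create an empty marker file; harmless if it already exists."""
    with open(marker_path(data_dir, index_id), "a"):
        pass


def index_is_valid(data_dir: str, index_id: int) -> bool:
    """An index counts as valid only while no marker file is present."""
    return not os.path.exists(marker_path(data_dir, index_id))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        print(index_is_valid(data_dir, 16384))
        mark_index_corrupt(data_dir, 16384)
        print(index_is_valid(data_dir, 16384))
```

In a real server the check would be cached rather than repeated on every
access, which is where the sinval message comes in: it would tell other
backends to discard their cached entry and look again.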

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Re: bug of recovery? at 2011-09-30 01:09:07 from Fujii Masao

Responses

Re: bug of recovery? at 2011-10-03 05:23:25 from Fujii Masao
