Re: bug of recovery?

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: bug of recovery?
Date: 2011-09-29 11:31:11
Message-ID: CAHGQGwHDmzMs-GjYw-k=Ob+YSeK9c9KpDcBzbm1BDNNp5xcb1g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 27, 2011 at 8:06 PM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
> On Sep27, 2011, at 07:59 , Heikki Linnakangas wrote:
>> On 27.09.2011 00:28, Florian Pflug wrote:
>>> On Sep26, 2011, at 22:39 , Tom Lane wrote:
>>>> It might be worthwhile to invoke XLogCheckInvalidPages() as soon as
>>>> we (think we) have reached consistency, rather than leaving it to be
>>>> done only when we exit recovery mode.
>>>
>>> I believe we also need to prevent the creation of restart points before
>>> we've reached consistency.
>>
>> Seems reasonable. We could still allow restartpoints when the hash table is empty, though. And once we've reached consistency, we can throw an error immediately in log_invalid_page(), instead of adding the entry in the hash table.
>
> That mimics the way the rm_safe_restartpoint callbacks work, which is good.
>
> Actually, why don't we use that machinery to implement this? There's currently no rm_safe_restartpoint callback for RM_XLOG_ID, so we'd just need to create one that checks whether invalid_page_tab is empty.

Okay, the attached patch prevents the creation of restartpoints by using
rm_safe_restartpoint callback if we've not reached a consistent state yet
and the invalid-page table is not empty. But the invalid-page table is not
tied to the specific resource manager, so using rm_safe_restartpoint for
that seems to slightly odd. Is this OK?

Also, according to other suggestions, the patch changes XLogCheckInvalidPages()
so that it's called as soon as we've reached a consistent state, and changes
log_invalid_page() so that it emits PANIC immediately if consistency is already
reached. These are very good changes, I think. Because they enable us to
notice serious problem which causes PANIC error immediately. Without these
changes, you unfortunately might notice that the standby database is corrupted
when failover happens. Though such a problem might rarely happen, I think it's
worth doing those changes.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment Content-Type Size
invalid_page_table_v1.patch text/x-patch 6.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2011-09-29 11:49:11 Re: bug of recovery?
Previous Message Kyotaro HORIGUCHI 2011-09-29 10:24:26 Re: [v9.2] make_greater_string() does not return a string in some cases