Re: recovery getting interrupted is not so unusual as it used to be

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: recovery getting interrupted is not so unusual as it used to be
Date: 2010-06-02 22:58:29
Message-ID: AANLkTimB7BhZHLDnmohja7qnIg54NayGh897ilIY1kDn@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 2, 2010 at 5:39 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> On 02/06/10 23:50, Robert Haas wrote:
>>
>> First, is it appropriate to set the control file state to
>> DB_SHUTDOWNED_IN_RECOVERY even when we're in crash recovery (as
>> opposed to archive recovery/SR)?  My vote is no, but Heikki thought it
>> might be OK.
>
> My logic on that is:
>
> If the database is known to be in good shape, i.e. not corrupt, after
> shutdown during crash recovery, then we should not print the warning at
> restart saying "This probably means that some data is corrupted". There's no
> reason to believe the database is corrupt if it's a controlled shutdown, so
> setting control file state to DB_SHUTDOWNED_IN_RECOVERY is OK. But if it's
> not OK for some reason, then we really shouldn't allow the shut down in the
> first place until we hit the end of WAL.
>
> So the option "allow shutdown, but warn at restart that your data is
> probably corrupt" does not make sense in any case.

Well, the point is, we emit that message every time we go to recover
from a crash. Presumably the message is as valid after a restart of
crash recovery as it was the first time around.

<thinks>

But maybe the message isn't right the first time either. After all,
the point of having a write-ahead log in the first place is that we
should be able to prevent corruption in the event of an unexpected
shutdown. Maybe the right thing to do is to forget about adding a new
state and just remove or change the errhint from these messages:

    ereport(LOG,
            (errmsg("database system was interrupted while in recovery at %s",
                    str_time(ControlFile->time)),
             errhint("This probably means that some data is corrupted and"
                     " you will have to use the last backup for recovery.")));

    ereport(LOG,
            (errmsg("database system was interrupted while in recovery at log time %s",
                    str_time(ControlFile->checkPointCopy.time)),
             errhint("If this has occurred more than once some data might be corrupted"
                     " and you might need to choose an earlier recovery target.")));

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
