Re: Theory about XLogFlush startup failures

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Theory about XLogFlush startup failures
Date: 2002-01-26 17:52:34
Message-ID: 11414.1012067554@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM> writes:
>> So I am still dissatisfied with doing elog(STOP) for this condition,
>> as I regard it as an overly strong reaction to corrupted data;
>> moreover, it does nothing to fix the problem and indeed gets in
>> the way of fixing the problem.

> ... It's not Ok automatically restart
> knowing about errors in data.

Actually, I disagree. If we come across clearly corrupt data values
(eg, bad length word for a varlena item, or even tuple-header errors
such as a bad XID), we do not try to force the admin to restore the
database from backup, do we? A bogus LSN is bad, certainly, but it
is not the end of the world and does not deserve a panic reaction.
At worst it tells us that one data page is corrupt. A robust system
should report that and keep plugging.

What would be actually useful here is to report which page contains
the bad LSN, so that the admin could look at it and decide what to do.
xlog.c doesn't know that, unfortunately. I'd be more interested in
expending work to make that happen than in expending work to make
a dbadmin's life more difficult --- and I rank forced stops in the
latter category.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2002-01-26 19:07:36 Re: contrib/tree
Previous Message G.Nagarajan 2002-01-26 16:14:31 Re: Bad Timestamp Format