Re: Theory about XLogFlush startup failures

From: Thomas Lockhart <lockhart(at)fourpalms(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Theory about XLogFlush startup failures
Date: 2002-01-27 01:16:45
Message-ID: 3C5354FD.504DB89C@fourpalms.org
Lists: pgsql-hackers

...
> > ... It's not Ok automatically restart
> > knowing about errors in data.
...
> At worst it tells us that one data page is corrupt. A robust system
> should report that and keep plugging.

Hmm. I'm not sure that this needs an "either-or" resolution on the
general topic of error recovery. Back when I used Ingres, it had the
feature that corruption would mark the database as "readonly" (a mode
I'd like us to have -- even without errors -- to help support upgrades,
updates, and error handling). So an administrator could evaluate the
damage without having further damage caused, but could allow users to
rummage through the database at the same time.

I have a hard time believing that we should *always* allow the database
to keep writing in the face of *any* detected error. I'm sure that is
not what Tom is saying, but in this case could further damage be caused
by subsequent writing when we *already* know that there is some
corruption? If so, we should consider supporting some sort of error
state that prevents further damage. Vadim's solution uses the only
current mechanism available, which is to force the database to shut down
until it can be evaluated. But if we had some stronger mechanisms to
support limited operation, that might help in this case and would
certainly help in other situations too.

- Thomas
