ERROR: XLogFlush: request

From: "Nitin Verma" <nitinverma(at)azulsystems(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: ERROR: XLogFlush: request
Date: 2007-04-13 09:52:24
Message-ID: 640150C1BB635E4C9F2F617BA3EFF1D101FBFF01@XCHMTV1.azulsystems.com
Lists: pgsql-general

Hi All,

Here is the relevant xlog.c code from the version we use (7.3.2):

    /*
     * If we still haven't flushed to the request point then we have a
     * problem; most likely, the requested flush point is past end of
     * XLOG. This has been seen to occur when a disk page has a corrupted
     * LSN.
     *
     * Formerly we treated this as a PANIC condition, but that hurts the
     * system's robustness rather than helping it: we do not want to take
     * down the whole system due to corruption on one data page. In
     * particular, if the bad page is encountered again during recovery
     * then we would be unable to restart the database at all! (This
     * scenario has actually happened in the field several times with 7.1
     * releases. Note that we cannot get here while InRedo is true, but if
     * the bad page is brought in and marked dirty during recovery then
     * CreateCheckpoint will try to flush it at the end of recovery.)
     *
     * The current approach is to ERROR under normal conditions, but only
     * WARNING during recovery, so that the system can be brought up even
     * if there's a corrupt LSN. Note that for calls from xact.c, the
     * ERROR will be promoted to PANIC since xact.c calls this routine
     * inside a critical section. However, calls from bufmgr.c are not
     * within critical sections and so we will not force a restart for a
     * bad LSN on a data page.
     */
    if (XLByteLT(LogwrtResult.Flush, record))
        elog(InRecovery ? WARNING : ERROR,
             "XLogFlush: request %X/%X is not satisfied --- flushed only to %X/%X",
             record.xlogid, record.xrecoff,
             LogwrtResult.Flush.xlogid,
             LogwrtResult.Flush.xrecoff);

A Java process using PostgreSQL 7.3.2 got these errors:

java.sql.SQLException: ERROR: XLogFlush: request 0/240169BC is not satisfied --- flushed only to 0/23FFC01C
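
To make the two positions in that message easier to read, here is a minimal C sketch that parses the `%X/%X` pairs and compares them the way the quoted `if` test appears to. The names `Lsn`, `lsn_parse`, `lsn_less_than` and `lsn_gap_same_log` are hypothetical helpers for illustration, not PostgreSQL functions, and the comparison only assumes that `XLByteLT` orders positions by `xlogid` first and `xrecoff` second.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a WAL position printed as "xlogid/xrecoff". */
typedef struct {
    uint32_t xlogid;   /* log file number (the part before the slash) */
    uint32_t xrecoff;  /* byte offset within that log file */
} Lsn;

/* Parse text such as "0/240169BC"; returns 1 on success, 0 on failure. */
static int lsn_parse(const char *text, Lsn *out)
{
    unsigned int hi, lo;

    if (sscanf(text, "%X/%X", &hi, &lo) != 2)
        return 0;
    out->xlogid = hi;
    out->xrecoff = lo;
    return 1;
}

/* Assumed ordering: compare xlogid first, then xrecoff. */
static int lsn_less_than(Lsn a, Lsn b)
{
    return a.xlogid < b.xlogid ||
           (a.xlogid == b.xlogid && a.xrecoff < b.xrecoff);
}

/*
 * Byte distance from 'from' to 'to' when both lie in the same log file.
 * Returns -1 if the xlogid values differ or 'to' is behind 'from'; the
 * cross-file case is deliberately left out of this sketch.
 */
static int64_t lsn_gap_same_log(Lsn from, Lsn to)
{
    if (from.xlogid != to.xlogid || to.xrecoff < from.xrecoff)
        return -1;
    return (int64_t) to.xrecoff - (int64_t) from.xrecoff;
}
```

With the values from the error above, `lsn_less_than` reports that the flushed position 0/23FFC01C is behind the requested 0/240169BC, and `lsn_gap_same_log` gives 108960 bytes (0x1A9A0), so the page's LSN points about 106 kB beyond what has been flushed.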

While these errors were filling the logs, we were able to connect via psql
and see all the data.

> This has been seen to occur when a disk page has a corrupted LSN
I suppose LSN refers to the logical sector number of the WAL. If that was
corrupted, how come we were able to access the data via psql? Is it just an
isolated phenomenon? Does Postgres have an auto-recovery for this? If yes, did
the old connections have stale values of the LSN?

Coming to safeguards:

1. Is there any use in restarting the Java process when this happens?
2. Is there any use in restarting the postmaster at this point, and is it safe
   to do so?

What should be done when this happens? Any suggestions are welcome.

-- Nitin
