Re: Problems starting up postgres

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Denis Perchine <dyp(at)perchine(dot)com>
Cc: Vadim Mikheev <vmikheev(at)sectorbase(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problems starting up postgres
Date: 2001-09-06 13:49:20
Message-ID: 1703.999784160@sss.pgh.pa.us
Lists: pgsql-hackers

Denis Perchine <dyp(at)perchine(dot)com> writes:
> Sep 6 02:09:30 mx postgres[13468]: [9] FATAL 2: XLogFlush: request(1494286336, 786458) is not satisfied --
> flushed to (23, 2432317444)

Yeek. Looks like you have a page somewhere in the database with a bogus
LSN value (xlog pointer) ... and, most likely, other corruption as well.
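
To see why the requested position looks bogus, here is a minimal sketch of comparing two-part log positions. It assumes the two numbers in each pair of the log message are (log file id, byte offset), in that order. The names LogPos, log_pos_lt and flush_request_satisfied are hypothetical stand-ins for illustration, not the actual PostgreSQL definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a two-part WAL position. */
typedef struct {
    uint32_t xlogid;   /* log file id (assumed to be the first number in the message) */
    uint32_t xrecoff;  /* byte offset within that log file (assumed second number) */
} LogPos;

/* True if position a is strictly earlier than position b. */
static bool log_pos_lt(LogPos a, LogPos b)
{
    if (a.xlogid != b.xlogid)
        return a.xlogid < b.xlogid;
    return a.xrecoff < b.xrecoff;
}

/*
 * A flush request is satisfied once the flushed position has reached
 * (or passed) the requested position.
 */
static bool flush_request_satisfied(LogPos request, LogPos flushed)
{
    return !log_pos_lt(flushed, request);
}
```

With the values from the report, request = {1494286336, 786458} and flushed = {23, 2432317444}, flush_request_satisfied() returns false: the requested log file id is far beyond anything the log has reached, which is what you would expect from a garbage LSN on a damaged page.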

>> BTW, how did you get into this state --- did you have a system crash?

> Yes. I was forced to fsck.

Okay. As a temporary recovery measure, I'd suggest reducing that
particular elog from STOP to DEBUG level. That will let you start up
and run the database. You'll need to look through your tables and try
to figure out which one(s) have lost data. It might be interesting to
try to figure out just which page has the bad LSN value --- that might
give us a clue why the WAL did not provide protection against this
failure. Unfortunately, XLogFlush doesn't have any idea who its caller
is, so the only way I can think of to check that directly is to set a
breakpoint at this error report and look at the call stack.
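
The effect of lowering the severity can be sketched with a small self-contained model. This is not the PostgreSQL source: Severity, report and check_flush are hypothetical names, and where the real STOP level aborts the process, this model simply returns false.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical severity levels, modeled loosely on the ones named above. */
typedef enum { SEV_DEBUG, SEV_NOTICE, SEV_STOP } Severity;

/*
 * Hypothetical report routine: logs the message and tells the caller
 * whether it may keep going. Only SEV_STOP halts processing.
 */
static bool report(Severity sev, const char *msg)
{
    const char *label = (sev == SEV_STOP)   ? "FATAL"
                      : (sev == SEV_NOTICE) ? "NOTICE"
                                            : "DEBUG";
    fprintf(stderr, "%s: %s\n", label, msg);
    return sev != SEV_STOP;
}

/*
 * Model of the flush check: when the request is not satisfied, the
 * chosen severity decides whether startup can continue.
 */
static bool check_flush(bool satisfied, Severity on_failure)
{
    if (satisfied)
        return true;
    return report(on_failure, "flush request is not satisfied");
}
```

In this model, check_flush(false, SEV_STOP) stops, while check_flush(false, SEV_DEBUG) logs the same message and lets the caller carry on. The underlying damage is still there in both cases; only the reaction to it changes.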

Vadim, what do you think of reducing this elog from STOP to a notice
on a permanent basis? ISTM we saw cases during 7.1 beta where this
STOP prevented people from recovering, so I'm thinking it does more
harm than good to overall system reliability.

regards, tom lane
