Re: pgsql error

From: "Mcleod, John" <johnm(at)spicergroup(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: pgsql error
Date: 2011-07-26 12:47:27
Message-ID: A2FA197FFED9FA42AD080EE933EFA8AE34C394A7@Spicer-mail.spicergroup.com
Lists: pgsql-general

Thank you for the reply.

At command line, I ran...
"psql --version"
and received..
"psql (PostgreSQL) 7.5devel"

The database is sitting on a Windows 2003 Server box.
A mapping application, written in PHP, runs with Apache 2.05

I know that in the past, the project manager would stop the database by just closing the .bat window, then restart it by double-clicking the postgis.bat file on the desktop.
I'm not sure if this was the beginning of the problem. I've since learned to shut down the database with "Ctrl C".

This batch file has the following...

cd c:\
cd ms4w/apps/pgsql75win/data/
del postmaster.pid

@ECHO OFF
set PATH=%PATH%; \ms4w\apps\pgsql75win\lib;\ms4w\apps\pgsql75win\bin;\ms4w\apps\pgsql75win\share\contrib

cd c:\
cd ms4w/apps/pgsql75win/
cmd /c "postmaster -D \ms4w\apps\pgsql75win\\data"

I hope this will give you some clues.

John

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Monday, July 25, 2011 11:20 PM
To: Merlin Moncure
Cc: Mcleod, John; pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] pgsql error

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Mon, Jul 25, 2011 at 3:05 PM, Mcleod, John <johnm(at)spicergroup(dot)com> wrote:
>> I'm receiving the following error
>> CONTEXT: writing block 614 of relation 394198/412175
>> WARNING: could not write block 614 of 394198/412175
>> DETAIL: Multiple failures --- write error may be permanent.
>> ERROR: xlog flush request 0/34D53680 is not satisfied --- flushed only to 0/34CD1EB0

> This is a fairly low level error that is telling you the WAL could not
> be written out. Out of drive space? Data corruption?

Yeah, this looks like the detritus of some previous failure. There are basically two possibilities:

1. The problem page's LSN field has gotten trashed so that it appears to be past the end of WAL.

2. The page actually did get updated by a WAL entry with that LSN, and then there was a crash for some reason, and the database tried to recover by replaying WAL, and it hit some problem that caused it to stop recovering before what had really been the end of WAL. So now it thinks the end of WAL is 0/34CD1EB0, but there are page(s) out there with LSNs past that, and when it finds one you start getting complaints like this.

I doubt theory #1, though, because there are nearby fields in a page header that evidently weren't trashed or else the page would have been recognized as being corrupt. Also the reported LSN is not very far past end of WAL, which would be unlikely in the event of random corruption.
So I'm betting on #2.
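
To put a number on "not very far past end of WAL", the two LSNs from the error message can be subtracted. The following is a minimal Python sketch, not part of the original thread; parse_lsn and lsn_gap are hypothetical helpers. It assumes the usual two-field hex notation and only handles the case where both values share the same high field (here 0), so the gap is simply the difference of the low fields.

```python
def parse_lsn(text: str) -> tuple[int, int]:
    """Split an LSN written as 'hi/lo' into its two hexadecimal fields."""
    hi, lo = text.split("/")
    return int(hi, 16), int(lo, 16)


def lsn_gap(requested: str, flushed: str) -> int:
    """Byte distance between two LSNs that share the same high field.

    Assumption: crossing a high-field boundary is not handled, because the
    mapping to a linear byte offset there depends on the server version.
    """
    req_hi, req_lo = parse_lsn(requested)
    fl_hi, fl_lo = parse_lsn(flushed)
    if req_hi != fl_hi:
        raise ValueError("LSNs have different high fields; not handled in this sketch")
    return req_lo - fl_lo


# Values copied from the error message quoted above.
gap = lsn_gap("0/34D53680", "0/34CD1EB0")
print(f"flush request is {gap} bytes past the flushed position")
```

The gap works out to 530384 bytes, a little over half a megabyte, which fits the argument that the page LSN sits close to the recorded end of WAL rather than at a random value.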

Unfortunately this tells us little about either the cause of the original crash, or the reason why recovery didn't work properly. We'd need a lot more information before speculating about that, for starters the exact Postgres version and the platform it's running on.

regards, tom lane
