Re: FATAL: the database system is in recovery mode

From: Ron St-Pierre <rstpierre(at)syscor(dot)com>
To: Doug McNaught <doug(at)mcnaught(dot)org>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: FATAL: the database system is in recovery mode
Date: 2004-07-07 21:42:26
Message-ID: 40EC6E42.6090507@syscor.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Doug McNaught wrote:

>Ron St-Pierre <rstpierre(at)syscor(dot)com> writes:
>
>
>
>>We're developing a java app and are using postgres as the database. On
>>our dev server I started the app, closed it, but the java process was
>>still open so I killed it, which caused the above error. I've had to
>>do this in the past but have not had this happen before. I've searched
>>the archives and found a message/reply from Andrew Sullivan about this
>>which implies that it may be more a file system problem. We're using
>>postgresql 7.4.1 on a debian box (and yes, ext2). And fsync is turned
>>on.
>>
>>
>
>Killing a client process shouldn't cause fatal errors on the
>server--the most you should get is "client unexpectedly closed
>connection". Something else is wrong.
>
>
>
>>Aside from not killing processes which hang, is there anything I can
>>do to avoid this problem in the future?
>>
>>
>
>What is the actual problem with Postgres? Did the server die? Is it
>refusing to start up? If you can give more complete logs that would
>be helpful.
>
>-Doug
>
>
>
It looks as if it hung up trying to write some customer preferences to
the db. Here's the snippet from /var/log/messages:

Jul 7 09:32:58 imp postgres[32351]: [550-1] LOG: statement: UPDATE
customer.clientPreferences SET preferences='<?xml version="1.0"
encoding="UTF-8"?>
Jul 7 09:32:58 imp postgres[32351]: [550-2] ^I<java version="1.4.2_03"
class="java.beans.XMLDecoder">
Jul 7 09:32:58 imp postgres[32351]: [550-3] ^I <object
class="java.util.HashMap">
Jul 7 09:32:58 imp postgres[32351]: [550-4] ^I <void method="put">

.... thousand(s) lines of XML ...

Jul 7 09:32:58 imp postgres[32351]: [550-1626] ^I <void
method="put">
Jul 7 09:32:58 imp postgres[26784]: [448-1] LOG: server process (PID
32349) was terminated by signal 11
Jul 7 09:32:58 imp postgres[26784]: [449-1] LOG: terminating any other
active server processes
Jul 7 09:32:58 imp postgres[32352]: [567-1] WARNING: terminating
connection because of crash of another server process
Jul 7 09:32:58 imp postgres[32350]: [482-1] WARNING: terminating
connection because of crash of another server process
Jul 7 09:32:58 imp postgres[32348]: [450-1] WARNING: terminating
connection because of crash of another server process
Jul 7 09:32:58 imp postgres[32352]: [567-2] DETAIL: The postmaster has
commanded this server process to roll back the current transaction and
exit, because another server
Jul 7 09:32:58 imp postgres[32352]: [567-3] process exited abnormally
and possibly corrupted shared memory.
Jul 7 09:32:58 imp postgres[32350]: [482-2] DETAIL: The postmaster has
commanded this server process to roll back the current transaction and
exit, because another server
Jul 7 09:32:58 imp postgres[32348]: [450-2] DETAIL: The postmaster has
commanded this server process to roll back the current transaction and
exit, because another server
Jul 7 09:32:58 imp postgres[32352]: [567-4] HINT: In a moment you
should be able to reconnect to the database and repeat your command.
Jul 7 09:32:58 imp postgres[32350]: [482-3] process exited abnormally
and possibly corrupted shared memory.
Jul 7 09:32:58 imp postgres[32350]: [482-4] HINT: In a moment you
should be able to reconnect to the database and repeat your command.
Jul 7 09:32:58 imp postgres[32348]: [450-3] process exited abnormally
and possibly corrupted shared memory.
Jul 7 09:32:58 imp postgres[32348]: [450-4] HINT: In a moment you
should be able to reconnect to the database and repeat your command.
Jul 7 09:32:58 imp postgres[32351]: [551-1] ERROR: terminating
connection because of crash of another server process
^ ^
^ ^ ^
Jul 7 09:32:58 imp postgres[32351]: [551-2] DETAIL: The postmaster has
commanded this server process to roll back the current transaction and
exit, because another server
Jul 7 09:32:58 imp postgres[32351]: [551-3] process exited abnormally
and possibly corrupted shared memory.
Jul 7 09:32:58 imp postgres[32351]: [551-4] HINT: In a moment you
should be able to reconnect to the database and repeat your command.
Jul 7 09:32:58 imp postgres[19223]: [432-1] WARNING: terminating
connection because of crash of another server process
Jul 7 09:32:58 imp postgres[19223]: [432-2] DETAIL: The postmaster has
commanded this server process to roll back the current transaction and
exit, because another server
... more ...
Jul 7 09:32:58 imp postgres[3802]: [295-3] process exited abnormally
and possibly corrupted shared memory.
Jul 7 09:32:58 imp postgres[3802]: [295-4] HINT: In a moment you
should be able to reconnect to the database and repeat your command.
Jul 7 10:02:24 imp -- MARK --
Jul 7 10:22:24 imp -- MARK --
Jul 7 10:42:24 imp -- MARK --
Jul 7 11:02:24 imp -- MARK --
Jul 7 11:22:24 imp -- MARK --
Jul 7 11:42:24 imp -- MARK --
Jul 7 12:02:24 imp -- MARK --
Jul 7 12:22:24 imp -- MARK --
Jul 7 12:42:24 imp -- MARK --
Jul 7 13:02:24 imp -- MARK --
Jul 7 13:22:24 imp -- MARK --
Jul 7 13:42:24 imp -- MARK --
Jul 7 14:00:36 imp postgres[32351]: [552-1] LOG: could not receive
data from client: Connection reset by peer
Jul 7 14:00:36 imp postgres[32351]: [553-1] LOG: unexpected EOF on
client connection
Jul 7 14:00:36 imp postgres[26784]: [450-1] LOG: all server processes
terminated; reinitializing
Jul 7 14:00:36 imp postgres[7962]: [451-1] LOG: database system was
interrupted at 2004-07-07 09:31:44 PDT
Jul 7 14:00:36 imp postgres[7962]: [452-1] LOG: checkpoint record is
at 5E/7DB31300
Jul 7 14:00:36 imp postgres[7962]: [453-1] LOG: redo record is at
5E/7DB31300; undo record is at 0/0; shutdown FALSE
Jul 7 14:00:36 imp postgres[7962]: [454-1] LOG: next transaction ID:
21787900; next OID: 125814322
Jul 7 14:00:36 imp postgres[7962]: [455-1] LOG: database system was
not properly shut down; automatic recovery in progress
Jul 7 14:00:36 imp postgres[7962]: [456-1] LOG: record with zero
length at 5E/7DB31340
Jul 7 14:00:36 imp postgres[7962]: [457-1] LOG: redo is not required
Jul 7 14:00:38 imp postgres[7962]: [458-1] LOG: database system is ready

^ ^ ^ ^
I just checked it again, and as you can see from the logs it has
restarted. Looking at the times in the logs, it crashed at 09:32:58 and
restarted at 14:00:36, but there's nothing scheduled in cron that I can
see which would cause the restart. Odd. Normally I would have restarted
it manually with pg_ctl but I wanted to see what would happen so I left it.

Anyway my original question still stands, any idea how to avoid this
problem in the future?

Thanks
Ron

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Doug McNaught 2004-07-07 21:47:28 Re: FATAL: the database system is in recovery mode
Previous Message Chris Smith 2004-07-07 21:40:24 Re: ERROR: missing chunk number 0 for toast value 14227980