Re: Postgres server crash

From: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
To: "Craig A(dot) James" <cjames(at)modgraph-usa(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres server crash
Date: 2006-11-16 02:28:29
Message-ID: 455BCCCD.2020701@pws.com.au
Lists: pgsql-performance
Craig A. James wrote:
> For the third time today, our server has crashed, or frozen, actually 
> something in between.  Normally there are about 30-50 connections 
> because of mod_perl processes that keep connections open.  After the 
> crash, there are three processes remaining:
>
> # ps -ef | grep postgres
> postgres 23832     1  0 Nov11 pts/1    00:02:53 
> /usr/local/pgsql/bin/postmaster -D /postgres/main
> postgres  1200 23832 20 14:28 pts/1    00:58:14 postgres: pubchem 
> pubchem 66.226.76.106(58882) SELECT
> postgres  4190 23832 25 14:33 pts/1    01:09:12 postgres: asinex 
> asinex 66.226.76.106(56298) SELECT
>
> But they're not doing anything: No CPU time consumed, no I/O going on, 
> no progress.  If I try to connect with psql(1), it says:
>
>   psql: FATAL:  the database system is in recovery mode
>
> And the server log has:
>
> LOG:  background writer process (PID 23874) was terminated by signal 9
> LOG:  terminating any other active server processes
> LOG:  statistics collector process (PID 23875) was terminated by signal 9
> WARNING:  terminating connection because of crash of another server 
> process
> DETAIL:  The postmaster has commanded this server process to roll back 
> the current transaction and exit, because another server process 
> exited abnormally and possibly corrupted shared memory.
> HINT:  In a moment you should be able to reconnect to the database and 
> repeat your command.
> WARNING:  terminating connection because of crash of another server 
> process
> DETAIL:  The postmaster has commanded this server process to roll back 
> the current transaction and exit, because another server process 
> exited ab
> ... repeats about 50 times, one per process.
>
> Questions:
>  1. Any idea what happened and how I can avoid this?  It's a *big* 
> problem.
>  2. Why didn't the database recover?  Why are there two processes
>     that couldn't be killed?
>  3. Where did the "signal 9" come from?  (Nobody but me ever logs
>     in to the server machine.)
>
I would guess it's the Linux OOM (out-of-memory) killer, if you are 
running Linux.  You need to stop the kernel from killing processes when 
it runs out of memory.  Are you getting close to running out of memory?

> Help!
>
> Thanks,
> Craig
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>
>

