Severe Badness On My Server: psql: FATAL: the database system is starting up

From: Mitchell Laks <mlaks(at)verizon(dot)net>
To: "Pgsql-Admin (E-mail)" <pgsql-admin(at)postgresql(dot)org>
Subject: Severe Badness On My Server: psql: FATAL: the database system is starting up
Date: 2005-03-13 16:12:00
Message-ID: 200503131112.01120.mlaks@verizon.net
Lists: pgsql-admin

Dear Gurus:
My server and I have had a very bad weekend, starting Friday afternoon.

I am running Debian Sarge, PostgreSQL 7.4.6, with Linux kernel 2.6.8.

I am running a PostgreSQL-backed application on a remote server. The system
has a system drive, on which the PostgreSQL database runs, and there is a RAID 1
array on which the application stores data.

Well, the raid1 failed (or is failing - or is trying its hardest to fail, not
clear yet...). This should not have affected the Postgresql database as it is
safely on a separate drive.
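
Since /home/big0 sits on /dev/md0 (Linux software RAID), the kernel's own view of the array is in /proc/mdstat. Here is a minimal sketch for spotting degraded arrays. It assumes the usual mdstat layout, where a status like [UU] means all members are up and an underscore marks a missing one. The function name degraded_arrays is made up for illustration.

```shell
# degraded_arrays [FILE]: print the name of every md array whose member
# status (e.g. [U_]) contains an underscore. Defaults to /proc/mdstat.
degraded_arrays() {
    awk '/^md[0-9]+ :/ { name = $1 }          # remember the current array
         /\[[U_]*_[U_]*\]/ { print name }     # status with a missing member
        ' "${1:-/proc/mdstat}"
}
```

Running degraded_arrays with no argument reads the live file; passing a saved copy lets you check it offline.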

However, when I logged onto the system, I found that I could not shut down
PostgreSQL. I logged in as postgres and ran pg_ctl stop; it printed ....... and
then could not stop (presumably because hanging client applications were still
connected to the database).

So then I killed all the application clients (kill -9 on each of them), and
still pg_ctl stop did not want to stop the server.

So I looked in ps aux, and the client applications all appeared to be in D
status:

wustl 18232 0.0 0.2 4872 1920 ? D Mar11 0:00 /usr/local/ctn/bi
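
For anyone hitting the same thing: D in the STAT column means uninterruptible sleep, usually a process blocked inside the kernel waiting on I/O. Signals, including kill -9, are not delivered until that wait ends. A rough sketch for listing such processes follows. The function name list_dstate is made up for illustration, and the wchan column is only as informative as the kernel makes it.

```shell
# list_dstate: keep only lines whose second field (STAT) starts with D.
# Reads "PID STAT WCHAN COMMAND" lines on stdin, so it also works on saved output.
list_dstate() {
    awk '$2 ~ /^D/'
}

# Live use with procps ps on Linux; the empty "=" headers suppress the header line:
# ps -e -o pid= -o stat= -o wchan= -o args= | list_dstate
```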

I then tried to reboot the system remotely, logging in as root and running
shutdown -r now and even shutdown -h now. Interestingly enough, the system
refused to shut down!!!!!!! (I have never ever seen this.)

I was floored! Well what to do? I decided to sleep on it.

Well, I logged in again on Saturday night and the system was still hanging in
this bizarre state. I now saw queued shutdown requests in the ps aux output.
And nothing was happening fast.

I thought. I read a little. I tried pg_ctl stop -m fast. It did nothing. I
prayed. I tried pg_dump LTA_IDB >lta_idb.dump to dump the database in
question. It didn't do anything.

I was desperate. I decided to try desperate measures. I then pulled the gun:

pg_ctl stop -m i.
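
For reference, pg_ctl has three shutdown modes. smart waits for clients to disconnect, fast rolls back open transactions and disconnects clients, and immediate aborts the server without a clean shutdown, so the next start has to run crash recovery. A minimal sketch of the escalation tried above follows. The PG_CTL variable and the function name stop_escalating are assumptions made for illustration, and it expects PGDATA to be set as usual.

```shell
# PG_CTL is overridable so the loop can be dry-run against a stub
# (an assumption for illustration; normally it is just pg_ctl on the PATH).
PG_CTL=${PG_CTL:-pg_ctl}

# stop_escalating: try smart, then fast, then immediate, stopping at the first
# mode that succeeds. Prints the mode that worked; returns 1 if none did.
stop_escalating() {
    for mode in smart fast immediate; do
        # -w waits for the shutdown to finish; pg_ctl exits non-zero if it gives up.
        # pg_ctl's own messages go to stderr so stdout carries only the mode name.
        if "$PG_CTL" stop -m "$mode" -w >&2; then
            echo "$mode"
            return 0
        fi
    done
    return 1
}
```

If even immediate mode fails, the backends are most likely stuck in the kernel rather than in PostgreSQL, and no pg_ctl option will help.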

OK, so it stopped. Then I said, let me try to dump the database, and so I did
pg_ctl start. It started:

postgres@A1:~$ pg_ctl status
pg_ctl: postmaster is running (PID: 21195)
Command line was:
/usr/lib/postgresql/bin/postmaster

Then I tried to dump the database and I got a message saying FATAL: the
database system is starting up. I waited a while and then I tried again.
Same message. I then tried psql LTA_IDB as the user of the database and got
the same FATAL message.

Then I tried psql LTA_IDB again and got FATAL: the database system is starting up.
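
That message is what the postmaster returns while it is still running crash recovery, which is expected after an immediate shutdown. Connections are refused until recovery finishes. Instead of retrying by hand, a polling loop along these lines could be used. This is only a sketch: the PSQL variable, the function name wait_for_db, and the default retry count and delay are all made up for illustration.

```shell
# PSQL is overridable for dry runs (an assumption for illustration).
PSQL=${PSQL:-psql}

# wait_for_db DBNAME [TRIES] [DELAY]: run a trivial query until the server
# accepts the connection. Returns 0 on success, 1 after TRIES failed attempts.
wait_for_db() {
    db=$1; tries=${2:-30}; delay=${3:-10}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # psql exits non-zero when it cannot connect
        if "$PSQL" -d "$db" -c 'SELECT 1' >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example: wait_for_db LTA_IDB && pg_dump LTA_IDB >lta_idb.dump
```

If the loop runs out of attempts, the server log is the place to see whether recovery is making progress or is itself blocked.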

I waited. Then I did pg_ctl stop (I don't know why I did it. Perversity, I
think.)

It then said to me
................ something about unable to stop.

Then I did

postgres@A1:~$ pg_dump LTA_IDB>lta_idb.dump
2005-03-13 10:56:33 [21481] LOG: connection received: host=[local] port=
2005-03-13 10:56:33 [21481] FATAL: the database system is shutting down
pg_dump: [archiver (db)] connection to database "LTA_IDB" failed: FATAL: the
dn

Now I did pg_ctl status:
postgres@A1:~$ pg_ctl status
pg_ctl: postmaster is running (PID: 21195)
Command line was:
/usr/lib/postgresql/bin/postmaster

OK I feel like I am in the twilight zone.

Next I did as root
cd /var/log
ls postg*

A1:/var/log# ls post*
postgres.log postgres.log.2.gz postgres.log.5.gz postgres.log.8.gz
postgres.log.1 postgres.log.3.gz postgres.log.6.gz postgres.log.9.gz
postgres.log.10.gz postgres.log.4.gz postgres.log.7.gz
A1:/var/log# less postgres.log
postgres.log: No such file or directory

WHAT????????
df -h
/dev/sda2 9.2G 2.8G 6.0G 32% /
tmpfs 443M 0 443M 0% /dev/shm
/dev/sda1 89M 11M 74M 13% /boot
/dev/sda3 7.4G 273M 6.7G 4% /home
/dev/sda8 11G 33M 9.9G 1% /mirror
/dev/sda7 449M 8.1M 417M 2% /tmp
/dev/sda6 7.4G 4.7G 2.4G 67% /var
/dev/md0 230G 139G 80G 64% /home/big0

I am in the twilight zone. My sanity is suspect. Any ideas on what to do next?
Pull the plug????
Mitchell
