From: | Christopher Cashell <topher-pgsql(at)zyp(dot)org> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Problems restarting after database crashed (signal 11). |
Date: | 2004-07-01 02:37:58 |
Message-ID: | 20040701023758.GB30122@zyp.org |
Lists: | pgsql-general |
At Wed, 30 Jun 04, Unidentified Flying Banana Tom Lane, said:
> Christopher Cashell <topher-pgsql(at)zyp(dot)org> writes:
> > Eventually I attempted to shut it down and restart it, however that
> > failed too. When I attempted to shut it down, I discovered a hung
> > 'startup subprocess' that can't be killed.
>
> This is interesting because it seems just about exactly like this
> recent Red Hat bug report:
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=126885
Hrm. Yes, it does appear to be a very similar, if not identical, issue.
> As I commented there, I think that it must be a kernel or hardware
> issue --- Postgres itself can surely not make an unkillable process.
> However it's common to see processes that don't respond to kill if
> they are stuck inside a kernel I/O request. That could mean either
> unresponsive hardware or a kernel bug.
That is somewhat along the lines of what I was thinking, although I have
had no problems like this before. The machine has been running for over
100 days, and the database as well, without issue.
28424 postgres 18 0 16804 3044 15m D 0.0 1.6 0:06.72 postmaster
Note that it does have a process status of 'D', or uninterruptible
sleep. That would explain the unkillable part, though I'm curious how
it ended up there, unless it just happened to be in a really bad spot
when Postgres segfaulted... although I wouldn't expect that to
affect the 'startup subprocess'.
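For anyone wanting to check for the same symptom, here is a minimal sketch of one way to list processes in state 'D' on Linux by reading /proc/<pid>/stat directly. The function names are made up for illustration, and it assumes the stat layout described in proc(5), where the third field is the state and the command name sits in parentheses.

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    lparen = stat_line.index("(")
    rparen = stat_line.rindex(")")
    pid = int(stat_line[:lparen].strip())
    comm = stat_line[lparen + 1:rparen]
    state = stat_line[rparen + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for processes in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_state(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited mid-scan, or the line was unparsable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

A process that shows up here on every run, rather than only occasionally, is the kind that will ignore kill, since signals are not delivered until the kernel call it is blocked in returns.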
> I wonder whether you have any similarities in hardware or Linux kernel
> to the person who filed the above report?
Here's all the information I can provide for this machine:
IBM IntelliStation Z Pro
Model: 6899-12U
Dual Pentium Pro 200
192MB RAM
4.5 GB IBM SCSI HDD
9 GB IBM SCSI HDD
6.4 GB WD HDD
The database resides on the 4.5 GB SCSI, with the pg_xlog directory
symlinked from there, and actually existing on the 9GB SCSI.
nexus:~$ uname -a
Linux nexus.zyp.org 2.6.4 #1 SMP Thu Mar 11 14:04:49 CST 2004 i686 GNU/Linux
nexus:~$ uptime
21:15:39 up 107 days, 20:57, 7 users, load average: 2.04, 2.31, 2.38
If there's any other information I can provide, please let me know.
I'm going to reboot the box right now, and cross my fingers, hoping
it'll come back up. ;-)
> regards, tom lane
--
| Christopher
+------------------------------------------------+
| Here I stand. I can do no other. |
+------------------------------------------------+