kill -9 does kill postmaster (or at least seems to). But I can't figure
out a way to get it restarted without a reboot -- I don't know what I'm
missing. The Fedora postgres restart scripts don't do the trick, and I
couldn't get it to work with pg_ctl either.
kill -9 doesn't work on the locked-up httpd processes, so the only way
to clear those is to restart the system.
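Processes that survive kill -9 are typically in uninterruptible sleep (state D in ps and top), which is the "uninterruptible disk wait" condition mentioned in the quoted reply below. The following is only a rough sketch for listing such processes on a Linux system with /proc mounted; list_d_state is a hypothetical helper name, not something shipped with Fedora or Postgres, and the optional argument exists only so the function can be pointed at a different directory.

```shell
# Print "PID NAME" for every process in uninterruptible sleep (state D).
# Hypothetical helper: the optional first argument is the proc root,
# which defaults to /proc.
list_d_state() {
    proc_root="${1:-/proc}"
    for status in "$proc_root"/[0-9]*/status; do
        # A process may exit between the glob and the read; skip it.
        [ -r "$status" ] || continue
        awk '
            /^Name:/  { name = $2 }
            /^State:/ { state = $2 }
            /^Pid:/   { pid = $2 }
            END { if (state == "D") print pid, name }
        ' "$status" 2>/dev/null
    done
}

list_d_state
```

If the stuck postgres and httpd processes show up in this list, signals will not reach them until the kernel call they are blocked in returns.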
[meme(at)chmmr]$ cat /proc/version
Linux version 126.96.36.199-170.2.104.fc10.i686
(mockbuild(at)xenbuilder4(dot)fedora(dot)phx(dot)redhat(dot)com) (gcc version 4.3.2
20081105 (Red Hat 4.3.2-7) (GCC) ) #1 SMP Mon Oct 12 22:01:53 EDT 2009
By default, Postgres lives in /var/lib/pgsql. When / started running out
of space, I moved it to /scratch and symlinked:
lrwxrwxrwx 1 root root 15 2009-09-11 16:57 pgsql
/ is on md0 and is RAID-1. /scratch is on md1 and is RAID-6:
[meme(at)chmmr]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/md0         64G   42G   18G  71% /
/dev/md1        2.5T  2.2T  239G  91% /scratch
/dev/sdb1       190M   38M  143M  21% /boot
/dev/sde1       190M   86M   95M  48% /boot2
/dev/sdd1       190M   86M   95M  48% /boot3
/dev/sda1       190M   86M   95M  48% /boot4
/dev/sdc1       190M   86M   95M  48% /boot5
tmpfs          1000M     0 1000M   0% /dev/shm
[meme(at)chmmr]$ cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md1 : active raid6 sde4 sdc4 sda4 sdb4 sdd4
      2722005120 blocks level 6, 64k chunk, algorithm 2 [5/5] [UUUUU]
md0 : active raid1 sde3 sdc3 sda3 sdb3 sdd3
      67119488 blocks [5/5] [UUUUU]
unused devices: <none>
Both filesystems are ext4.
Thanks for your help!
On Sun, 2009-10-25 at 23:13 -0400, Tom Lane wrote:
> Karen Pease <meme(at)daughtersoftiresias(dot)org> writes:
> > It'll get through about three or four of them (out of hundreds) before
> > it locks up. Now, before lockup, postmaster is very active. It shows
> > up on top. The computer's hard drives clack nonstop. Etc. But once it
> > locks up (without warning), all of that stops. Postmaster does nothing.
> > The computer goes silent. I can't ctrl-break the psql process. If I
> > try to start a new psql process, it won't get past the password prompt
> > -- psql will hang. All Apache processes involving postgres queries
> > hang. The postgres server cannot be restarted by any normal means (the
> > only solution I've found that works is a reboot). And so forth.
> This sounds to me like it's a kernel problem, possibly triggered by
> misbehaving disk hardware. What you might try to confirm is a kill -9
> on whichever postgres backend seems to be stuck. If that fails to
> remove the process, then it's definitely a kernel issue --- try googling
> "uninterruptible disk wait" and similar phrases.
> The cases that I've run into personally have been due to poor error
> handling for a disk failure condition in a kernel-level disk driver.
> If that's what it is for you, the bottom-level problem might be an
> unreadable disk block somewhere. Or it might just be a garden variety
> kernel bug. What's the platform?
> regards, tom lane