On 26/10/2009 5:28 PM, Karen Pease wrote:
> I did my best to follow the gdb instructions. I ran:
> gdb -p 2852
> Then connected, entered the logging statements, then ran "cont", then
> ctrl-c'ed it a couple times. I got:
OK, so there's nothing shriekingly obviously wrong with what the
postmaster is up to. But what about the backend that's stopped
responding? Try connecting gdb to that "postgres" process once it's
stopped responding and get a backtrace from that.
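Something along these lines should grab the backtrace without an interactive session. It's only a sketch: it assumes gdb is installed and that you run it as root or as the postgres user, and "backend_bt" is just a name I made up for the helper. The PID is whatever ps shows for the stuck "postgres" backend, not the postmaster.

```shell
#!/bin/sh
# backend_bt PID: attach gdb to PID, print a backtrace, then detach.
# (Hypothetical helper; the name is not part of PostgreSQL or gdb.)
backend_bt() {
    # Refuse anything that is not a plain decimal PID.
    case $1 in
        ''|*[!0-9]*)
            echo "usage: backend_bt PID" >&2
            return 2
            ;;
    esac
    # -batch exits once the commands have run; -ex runs one gdb command.
    gdb -p "$1" -batch -ex "bt"
}
```

Usage would be "backend_bt 1234 > backend-bt.txt 2>&1", with 1234 replaced by the real backend PID.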
> [root(at)chmmr dbscripts]# ps ax -o pid,ppid,stat,wchan:50,cmd | grep -i
> 3376 1 D
> start_this_handle /usr/sbin/httpd
start_this_handle appears in common ext4 call paths, and several lkml
issue reports over time:
Smells like a kernel bug. When two extremely stable pieces of software
(Pg and apache) both have issues on a well tested kernel (Linux) with a
new and fairly immature file system in use (ext4), that's probably not
an unreasonable assumption.
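To see whether anything else is stuck in the same place, you could list every process in uninterruptible sleep (state "D") together with its wait channel. A rough sketch, assuming the procps "ps" that your own command line above already uses; the function names are made up for illustration:

```shell
#!/bin/sh
# filter_dstate: read "PID STAT WCHAN CMD" lines on stdin and keep the
# header line plus every process whose state starts with D.
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# list_dstate: show all D-state processes with a wide wchan column.
list_dstate() {
    ps ax -o pid,stat,wchan:50,cmd | filter_dstate
}
```

If httpd and postgres both turn up with start_this_handle in the WCHAN column, that points even more firmly at the file system rather than at either program.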
You can find out a bit more about what the kernel is doing using the
"magic" keyboard sequence "ALT-SysRQ-T" from a vconsole (not under X).
If the results scroll past too fast you can page through them with
"less" on /var/log/kern.log (or /var/log/dmesg depending on your distro)
or using the "dmesg" command.
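If you can't get at a vconsole, the same task dump can usually be requested from a root shell by writing "t" to /proc/sysrq-trigger. A sketch only: it assumes the kernel was built with magic SysRq support, and both the "dump_tasks" name and the SYSRQ_TRIGGER override are things I invented for the example.

```shell
#!/bin/sh
# dump_tasks: ask the kernel to log the state and stack of every task
# (the equivalent of Alt-SysRq-T), then show the end of the kernel log.
dump_tasks() {
    # SYSRQ_TRIGGER is a hypothetical override, handy for dry runs.
    trigger=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}
    if [ ! -w "$trigger" ]; then
        echo "cannot write $trigger (not root, or no sysrq support?)" >&2
        return 1
    fi
    echo t > "$trigger"
    # The dump is long; keep only the most recent part.
    dmesg | tail -n 200
}
```

Pipe the output through "less" and search for httpd or postgres to find the stack traces of the stuck processes.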
I won't be too surprised if you see a kernel stack trace for your httpd
process(es) starting something like this: