Re: "recovery mode"

From: "Steve Wolfe" <steve(at)iboats(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: "recovery mode"
Date: 2001-01-23 17:11:17
Message-ID: 004b01c0855f$856a56e0$50824e40@iboats.com
Lists: pgsql-general

> I don't think recovery mode actually does much in 7.0.* --- I think it's
> just a stub (Vadim might know better though). In 7.1 it means the thing
> is replaying the WAL log after a crash. In any case it shouldn't
> create a lockup condition like that.
>
> The only cases I've ever heard of where a user process couldn't be
> killed with kill -9 are where it's stuck in a kernel call (and the
> kill response is being held off till the end of the kernel call).
> Any such situation is arguably a kernel bug, of course, but that's
> not a lot of comfort.
>
> Exactly which process were you sending kill -9 to, anyway? There should
> have been a postmaster and one backend running the recovery-mode code.
> If the postmaster was responding to connection requests with an error
> message, then I would not say that it was locked up.

I believe that it was a backend that I tried -9'ing. I knew it wasn't a
good thing to do, but I had to get it running again. It's amazing how bold
you get when you hear an entire department mumbling about "Why isn't the
site working?". :)

Anyway, I think the problem wasn't in postgres. I rebooted the machine,
and it worked - for about ten minutes. Then it froze, with the kernel
crapping out. I rebooted it, and it lasted about three minutes until the
same thing happened. On the next reboot, it didn't even get through the
fsck before it did it again.

I looked at the CPU temps; one of the four was warmer than it should be,
but still within acceptable limits (40 C). So, I shut it down and reseated
the RAM chassis, the DIMMs, the CPUs, and the expansion cards. When it came
up, I compiled and installed a newer kernel (I guess there was some good in
the crashes), and then it worked fine. Because of the symptoms, I imagine
that it was a flaky connection. Odd, considering that everything except the
DIMMs (including the CPUs) is literally screwed to the motherboard!

steve
