Re: [ADMIN] recovery is stuck when children are not processing SIGQUIT from previous crash

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [ADMIN] recovery is stuck when children are not processing SIGQUIT from previous crash
Date: 2009-12-09 13:57:42
Message-ID: 1260367062.8753.8.camel@fsopti579.F-Secure.com
Lists: pgsql-admin pgsql-hackers

[moved to -hackers]

On Thu, 2009-11-12 at 22:37 +0200, Marko Kreen wrote:
> On 11/12/09, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Marko Kreen <markokr(at)gmail(dot)com> writes:
> > > You talked about blocking in quickdie(), but you'd need
> > > to block in elog().
> >
> > I'm not really particularly worried about that case. By that logic,
> > we could not use quickdie at all, because any part of the system
> > might be doing something that wouldn't survive being interrupted.
>
> Not really - we'd need to care only about parts that quickdie()
> (or any other signal handler) wants to use. Which basically means
> elog() only.
>
> OK, full elog() is a beast, but why would the SIGQUIT handler need full
> elog()? How about we export a minimal log-writing function and make
> that signal-safe - that is, drop the message if already active. This
> will exchange a potential crash/deadlock for a lost msg, which seems
> slightly better behaviour.

Yeah, on reflection, calling elog in the SIGQUIT handler is just waiting
for trouble. The call could block for any number of reasons, because
there is a boatload of functionality that comes with a logging call. In
the overall scheme of things, you don't really lose much if you just
delete the call altogether, because in the event that it's called the
postmaster will already have logged that it is going to kill all
children. Or there ought to be some kind of alarm that would abort the
thing if it takes too long.
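
To make the two ideas above concrete, here is a rough sketch in C, not a
patch. It shows a minimal log writer that drops the message if a write is
already active, plus a handler that arms an alarm as a backstop. All names
(minimal_log_active, minimal_log_write, quickdie_sketch) and the 5-second
timeout are hypothetical and not taken from the PostgreSQL source. It also
assumes SIGALRM is not blocked in the process and that exiting with status 2
is acceptable.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical guard: nonzero while a minimal log write is in progress. */
volatile sig_atomic_t minimal_log_active = 0;

/*
 * Hypothetical minimal log writer (not an existing PostgreSQL function).
 * It only calls write(2), which POSIX lists as async-signal-safe.
 * Returns 1 if a write was attempted, 0 if the message was dropped
 * because another minimal log write was already active.
 */
int
minimal_log_write(int fd, const char *msg)
{
    int     saved_errno = errno;
    size_t  len = 0;
    ssize_t rc;

    if (minimal_log_active)
        return 0;               /* already active: drop the message */
    minimal_log_active = 1;

    while (msg[len] != '\0')    /* compute the length without libc helpers */
        len++;
    rc = write(fd, msg, len);   /* best effort; a short write is ignored */
    (void) rc;

    minimal_log_active = 0;
    errno = saved_errno;        /* leave errno as the interrupted code saw it */
    return 1;
}

/*
 * Hypothetical SIGQUIT handler. The alarm is the backstop: if the write
 * blocks for more than 5 seconds, SIGALRM's default action terminates
 * the process. This assumes SIGALRM is unblocked at this point.
 */
void
quickdie_sketch(int signo)
{
    (void) signo;
    signal(SIGALRM, SIG_DFL);   /* make sure the default action applies */
    alarm(5);                   /* assumed timeout, chosen arbitrarily */
    minimal_log_write(STDERR_FILENO,
                      "terminating because another server process crashed\n");
    _exit(2);                   /* exit without running atexit callbacks */
}
```

A process would install the handler with sigaction(SIGQUIT, ...). The guard
is a plain flag, so it only protects against re-entry within one
single-threaded process, which is the case under discussion here.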
