From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Nikhil Sontakke <nikhil(dot)sontakke(at)enterprisedb(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: PG signal handler and non-reentrant malloc/free calls |
Date: | 2011-02-28 12:27:08 |
Message-ID: | 4D6B949C.5050903@enterprisedb.com |
Lists: | pgsql-hackers |
On 28.02.2011 14:04, Nikhil Sontakke wrote:
> I believe we have a case where not holding off interrupts while doing a
> malloc() can cause a deadlock due to system or libc level locking. In this
> case, a pg_ctl stop in fast mode was resorted to and that caused a backend
> to handle the interrupt when it was inside the malloc call. Now as part of
> the abort processing, in the subtransaction cleanup code path, this same
> backend tried to clear memory contexts, leading to an eventual free() call.
> The free() call tried to take the same lock which was already held by
> malloc() earlier, resulting in a deadlock!
Our signal handlers shouldn't try to do anything that complicated.
die(), which handles the SIGTERM sent to backends by a fast shutdown,
doesn't do abort processing itself; it just sets a global variable.
The exception is when ImmediateInterruptOK is set, but that is only set
around a few blocking system calls where it is safe to do so. (Checks...)
Actually, md5_crypt_verify() looks suspicious: it does
"ImmediateInterruptOK = true", and then calls palloc() and pfree().
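
To illustrate the general pattern described above (the handler only records the request, and ordinary code acts on it later at a known-safe point), here is a minimal sketch in plain POSIX C. It is not the PostgreSQL source: the names shutdown_requested, handle_sigterm(), install_sigterm_handler(), check_for_shutdown() and do_work() are hypothetical and exist only for this example, and malloc()/free() stand in for palloc()/pfree().

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag for this sketch; set by the handler, read by normal code. */
static volatile sig_atomic_t shutdown_requested = 0;

/*
 * The handler does only async-signal-safe work: it records that a
 * shutdown was requested. It never calls malloc(), free(), or any
 * cleanup code, so it cannot re-enter the allocator's internal lock.
 */
static void handle_sigterm(int signo)
{
    (void) signo;
    shutdown_requested = 1;
}

/* Install the handler for SIGTERM. Returns 0 on success, -1 on failure. */
static int install_sigterm_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handle_sigterm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}

/*
 * Called from ordinary code at points where no allocator call is in
 * progress. Returns 1 (and clears the flag) if a shutdown is pending,
 * 0 otherwise. Real cleanup would be started from here, not from the
 * signal handler.
 */
static int check_for_shutdown(void)
{
    if (shutdown_requested)
    {
        shutdown_requested = 0;
        return 1;
    }
    return 0;
}

/*
 * Run up to `steps` units of work, each of which allocates and frees
 * memory. The pending-shutdown check happens only between allocator
 * calls. Returns the number of steps completed.
 */
static int do_work(int steps)
{
    assert(steps >= 0);

    for (int i = 0; i < steps; i++)
    {
        void *p = malloc(64);   /* a signal may arrive while this runs */

        free(p);
        if (check_for_shutdown())
            return i + 1;       /* stop at a safe point */
    }
    return steps;
}
```

A caller would run install_sigterm_handler() once at startup and then call do_work() or check_for_shutdown() from its main loop. The hazard discussed in this thread corresponds to doing the cleanup directly inside handle_sigterm() instead: if the signal arrived in the middle of malloc(), a free() in the handler could block on a lock the interrupted malloc() still holds.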
> Will try to get the call stack if needed.
Yes, please.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com