Install a "dead man switch" to allow the postmaster to detect cases where
a backend has done exit(0) or exit(1) without having disengaged itself
from shared memory. We are at risk for this whenever third-party code is
loaded into a backend, since such code might not know it's supposed to go
through proc_exit() instead. Also, it is reported that under Windows
there are ways to externally kill a process that cause the status code
returned to the postmaster to be indistinguishable from a voluntary exit
(thank you, Microsoft). If this does happen then the system is probably
hosed --- for instance, the dead session might still be holding locks.
So the best recovery method is to treat this like a backend crash.

The dead man switch is armed for a particular child process when it
acquires a regular PGPROC, and disarmed when the PGPROC is released;
these should be the first and last touches of shared memory resources
in a backend, or close enough anyway. This choice means there is no
coverage for auxiliary processes, but I doubt we need that, since they
shouldn't be executing any user-provided code anyway.
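
The arm/disarm protocol described above can be sketched as follows. This is
a minimal illustration, not the code from the patch: the names slot_state,
assign_slot, arm_switch, disarm_switch, release_slot and child_exit_is_crash
are hypothetical, the array stands in for a structure that would really live
in shared memory, and treating any exit code other than 0 or 1 as a crash is
an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CHILDREN 8      /* hypothetical fixed number of child slots */

/* One flag per child; a real server would keep this in shared memory. */
typedef enum { SLOT_UNUSED, SLOT_ASSIGNED, SLOT_ARMED } SlotState;

static SlotState slot_state[MAX_CHILDREN];

/* Parent: reserve a slot before starting a child; -1 if none is free. */
static int assign_slot(void)
{
    for (int i = 0; i < MAX_CHILDREN; i++)
    {
        if (slot_state[i] == SLOT_UNUSED)
        {
            slot_state[i] = SLOT_ASSIGNED;
            return i;
        }
    }
    return -1;
}

/* Child: arm the switch at its first touch of shared resources. */
static void arm_switch(int slot)
{
    assert(slot >= 0 && slot < MAX_CHILDREN);
    assert(slot_state[slot] == SLOT_ASSIGNED);
    slot_state[slot] = SLOT_ARMED;
}

/* Child: disarm the switch at its last touch of shared resources. */
static void disarm_switch(int slot)
{
    assert(slot >= 0 && slot < MAX_CHILDREN);
    assert(slot_state[slot] == SLOT_ARMED);
    slot_state[slot] = SLOT_ASSIGNED;
}

/* Parent: free the slot and report whether the switch was still armed. */
static bool release_slot(int slot)
{
    assert(slot >= 0 && slot < MAX_CHILDREN);
    bool was_armed = (slot_state[slot] == SLOT_ARMED);
    slot_state[slot] = SLOT_UNUSED;
    return was_armed;
}

/* Parent: decide whether a child's exit must be handled like a crash. */
static bool child_exit_is_crash(int slot, int exit_code)
{
    bool was_armed = release_slot(slot);

    if (exit_code != 0 && exit_code != 1)
        return true;        /* assumed: other codes already count as crashes */
    return was_armed;       /* "clean" exit code, but the switch never disarmed */
}
```

In this sketch a child that arms and then disarms before exit(0) is reported
as a normal exit, while a child that calls exit(0) with the switch still
armed is reported as a crash.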

This patch also improves the management of the EXEC_BACKEND
ShmemBackendArray array a bit, by reducing search costs.
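
The message does not say how the search costs are reduced. One plausible
approach, shown here purely as an assumption, is to store each entry at an
index the parent already knows (such as a per-child slot number), so that
adding and removing an entry needs no scan. BackendEntry, backend_array and
the three functions below are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 8      /* hypothetical fixed number of child slots */

/* Hypothetical per-backend record; pid == 0 marks an empty entry. */
typedef struct
{
    long pid;
    long cancel_key;
} BackendEntry;

static BackendEntry backend_array[MAX_CHILDREN];

/* Add: direct index by slot, no search for a free entry. */
static void backend_array_add(int slot, long pid, long cancel_key)
{
    assert(slot >= 0 && slot < MAX_CHILDREN);
    assert(backend_array[slot].pid == 0);
    backend_array[slot].pid = pid;
    backend_array[slot].cancel_key = cancel_key;
}

/* Remove: direct index by slot, no search for the matching pid. */
static void backend_array_remove(int slot)
{
    assert(slot >= 0 && slot < MAX_CHILDREN);
    backend_array[slot].pid = 0;
    backend_array[slot].cancel_key = 0;
}

/* Lookup by pid still scans, for callers that do not know the slot. */
static const BackendEntry *backend_array_find(long pid)
{
    for (int i = 0; i < MAX_CHILDREN; i++)
    {
        if (backend_array[i].pid == pid)
            return &backend_array[i];
    }
    return NULL;
}
```

With this layout only lookups by pid remain linear; the add and remove paths
become constant time.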

Although this problem is of long standing, the lack of field complaints
seems to mean it's not critical enough to risk back-patching; at least
not till we get some more testing of this mechanism.

Modified files:
postmaster.c (r1.580 -> r1.581)
ipci.c (r1.99 -> r1.100)
pmsignal.c (r1.26 -> r1.27)
proc.c (r1.205 -> r1.206)
globals.c (r1.107 -> r1.108)
miscadmin.h (r1.209 -> r1.210)
postmaster.h (r1.19 -> r1.20)
pmsignal.h (r1.23 -> r1.24)