From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Postmaster self-deadlock due to PLT linkage resolution |
Date: | 2022-08-30 12:16:36 |
Message-ID: | CA+hUKGJJPvf=nQXQwsb91EoEvMW89+qLmRpE5rG3O24HpMmkcw@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Aug 30, 2022 at 7:44 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Buildfarm member mamba (NetBSD-current on prairiedog's former hardware)
> has failed repeatedly since I set it up. I have now run the cause of
> that to ground [1], and here's what's happening: if the postmaster
> receives a signal just before it first waits at the select() in
> ServerLoop, it can self-deadlock. During the postmaster's first use of
> select(), the dynamic loader needs to resolve the PLT branch table entry
> that the core executable uses to reach select() in libc.so, and it locks
> the loader's internal data structures while doing that. If we enter
> a signal handler while the lock is held, and the handler needs to do
> anything that also requires the lock, the postmaster is frozen.
. o O ( pselect() wouldn't have this problem, but it's slightly too
new for the back branches that didn't yet require SUSv3... drat )
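To illustrate the pselect() remark: presumably the point is that pselect() swaps the signal mask atomically inside the call, so any lazy PLT binding for pselect() itself happens while signals are still blocked. A minimal sketch, assuming a single-threaded program and SIGUSR1 as the signal of interest. The helper names (install_handler, block_sigusr1, wait_unblocked) are hypothetical and not taken from the PostgreSQL sources:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

/* Set by the handler; a plain store to sig_atomic_t is async-signal-safe. */
static volatile sig_atomic_t got_signal = 0;

static void
on_sigusr1(int signo)
{
    (void) signo;
    got_signal = 1;
}

/* Install the SIGUSR1 handler (hypothetical helper). Returns 0 on success. */
static int
install_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Block SIGUSR1 and save the previous mask in *orig. Returns 0 on success. */
static int
block_sigusr1(sigset_t *orig)
{
    sigset_t block;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    return sigprocmask(SIG_BLOCK, &block, orig);
}

/*
 * Wait up to timeout_sec seconds with SIGUSR1 unblocked only for the
 * duration of the pselect() call itself.
 * Returns 1 if a signal interrupted the wait, 0 on timeout, -1 on error.
 */
static int
wait_unblocked(const sigset_t *orig, long timeout_sec)
{
    struct timespec ts;
    sigset_t during;
    int rc;

    ts.tv_sec = timeout_sec;
    ts.tv_nsec = 0;
    during = *orig;
    sigdelset(&during, SIGUSR1);    /* mask in effect only inside pselect() */

    rc = pselect(0, NULL, NULL, NULL, &ts, &during);
    if (rc < 0)
        return (errno == EINTR) ? 1 : -1;
    return 0;
}
```

With this shape the handler can only run while the process is inside pselect(); by then the call has already been reached through the PLT, so the loader is no longer in the middle of resolving it.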
> I'd originally intended to make this code "#ifdef __NetBSD__",
> but on looking into the FreeBSD sources I find much the same locking
> logic in their dynamic loader, and now I'm wondering if such behavior
> isn't pretty standard. The added calls should have negligible cost,
> so it doesn't seem unreasonable to do them everywhere.
FWIW I suspect FreeBSD can't break like this in a program linked with
libthr, because it has a scheme for deferring signals while the
runtime linker holds locks. _rtld_bind calls _thr_rtld_rlock_acquire,
which uses the THR_CRITICAL_ENTER mechanism to cause thr_sighandler to
defer until release. For a non-threaded program, I'm not entirely sure,
but I don't think the fork() problem exists there. (Could be wrong,
based on a quick look.)
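The quoted text does not show what the "added calls" are. Purely as an assumption, one way to keep lazy binding out of the vulnerable window is to call the function once while signals are still blocked, so its PLT entry is already resolved before the real wait happens. A sketch with a hypothetical helper name (prebind_select), not the actual patch:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Hypothetical helper: make one harmless select() call with all signals
 * blocked, so that any lazy PLT resolution for select() happens where no
 * signal handler can interrupt the dynamic loader.
 * Returns select()'s result (0 is expected: no descriptors, zero timeout).
 */
static int
prebind_select(void)
{
    sigset_t all;
    sigset_t orig;
    struct timeval tv;
    int rc;

    tv.tv_sec = 0;
    tv.tv_usec = 0;

    sigfillset(&all);
    sigprocmask(SIG_BLOCK, &all, &orig);

    /* Zero descriptors and a zero timeout: returns immediately. */
    rc = select(0, NULL, NULL, NULL, &tv);

    /* Restore whatever mask the caller had. */
    sigprocmask(SIG_SETMASK, &orig, NULL);
    return rc;
}
```

A program would call this once during startup, before the point where it first unblocks signals around its main wait loop. The cost is a single system call.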
> (Of course, a much better answer is to get out of the business of
> doing nontrivial stuff in signal handlers. But even if we get that
> done soon, we'd surely not back-patch it.)
+1