Re: Strange failure on mamba

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Strange failure on mamba
Date: 2022-12-01 00:36:09
Message-ID: 1473980.1669854969@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2022-11-30 18:33:06 -0500, Tom Lane wrote:
>> Even if somebody comes up with a rewrite to avoid doing interesting stuff in
>> the postmaster's signal handlers, we surely wouldn't risk back-patching it.

> Would that actually fix anything, given NetBSD's brokenness? If we used a
> latch-like mechanism, the signal handler would still use functions in libc. So
> postmaster could deadlock, at least during the first execution of a signal
> handler? So I think 8acd8f869 continues to be important...

I agree that "-z now" is a good idea for performance reasons, but
what we're seeing is that it's only a partial fix for NetBSD's issue,
since it doesn't apply to shared libraries that the postmaster pulls
in.

I'm not sure about your thesis that things are fundamentally broken.
It does seem like if a signal handler does SetLatch then that could
require PLT resolution, and if it interrupts something else doing
PLT resolution then we have a problem. But if it were a live
problem then we'd have seen instances outside of the postmaster's
select() wait, and we haven't.

I'm kind of inclined to band-aid that select() call as previously
suggested, and see where we end up.

regards, tom lane
