Re: pgsql: Get rid of the dedicated latch for signaling the startup process

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fujii Masao <fujii(at)postgresql(dot)org>, pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Get rid of the dedicated latch for signaling the startup process
Date: 2020-11-04 11:20:09
Message-ID: c50bee3b-f7b6-1052-c161-699ec83fdf85@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

On 2020/11/04 19:27, Heikki Linnakangas wrote:
> On 04/11/2020 09:44, Fujii Masao wrote:
>> Get rid of the dedicated latch for signaling the startup process.
>>
>> This commit gets rid of the dedicated latch for signaling the startup
>> process in favor of using its procLatch,  since that comports better
>> with possible generic signal handlers using that latch.
>>
>> Commit 1e53fe0e70 changed background processes so that they use standard
>> SIGHUP handler. Like that, this commit also makes the startup process use
>> standard SIGHUP handler to simplify the code.
>
> This seems to have made buildfarm member 'elver' to segfault. I've got a hunch that setting recoveryWakeupLatch to NULL, when WakeupRecovery() doesn't check for NULL, is not OK. It's surprising that we're only seeing this on 'elver', though.

Thanks for the report! The latch is reset to NULL after ShutdownWalRcv(). So I thought that there are no processes setting the latch (i.e., walreceiver has already exited) after it's reset to NULL. But this is not true. There seem a window between ShutdownWalRcv() and the actual exit of walreceiver. If a signal is sent during that window, the segmentation fault would happen. This would be the reason why segv happened in some platforms, but not in others.

I'm thinking to remove the following code to fix this issue. Thought?

/*
* We don't need the latch anymore. It's not strictly necessary to reset
* it to NULL, but let's do it for the sake of tidiness.
*/
if (ArchiveRecoveryRequested)
XLogCtl->recoveryWakeupLatch = NULL;

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Peter Eisentraut 2020-11-04 11:50:18 pgsql: Enable hash partitioning of text arrays
Previous Message Heikki Linnakangas 2020-11-04 10:27:53 Re: pgsql: Get rid of the dedicated latch for signaling the startup process