Re: pgsql: Get rid of the dedicated latch for signaling the startup process

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fujii Masao <fujii(at)postgresql(dot)org>, pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Get rid of the dedicated latch for signaling the startup process
Date: 2020-11-05 14:32:14
Message-ID: ee9eeaec-a45f-f821-e0d6-fd3846c515ec@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

On 2020/11/05 6:02, Fujii Masao wrote:
>
>
> On 2020/11/05 5:36, Heikki Linnakangas wrote:
>> On 04/11/2020 15:17, Heikki Linnakangas wrote:
>>> On 04/11/2020 14:03, Fujii Masao wrote:
>>>> Or ISTM that WakeupRecovery() should set the latch only when the latch
>>>> has not been reset to NULL yet.
>>>
>>> Got to be careful with race conditions, if the latch is set to NULL at
>>> the same time that WakeupRecovery() is called.
>>
>> I don't think commit 113d3591b8 got this quite right:
>>
>>> void
>>> WakeupRecovery(void)
>>> {
>>>     if (XLogCtl->recoveryWakeupLatch)
>>>         SetLatch(XLogCtl->recoveryWakeupLatch);
>>> }
>>
>> If XLogCtl->recoveryWakeupLatch is set to NULL between the if and the SetLatch, you'll still get a segfault. That's highly unlikely to happen in practice because the compiler will optimize that into a single load instruction, but could happen with -O0. I think you'd need to do the access only once, using a volatile pointer, to make that safe.

On second thought, since fetching the latch pointer might not be atomic,
maybe we need to use spinlock like WalRcvForceReply() does. But using
spinlock in every calls of WakeupRecovery() might cause performance
overhead. So I'm thinking to use spinlock only when it's necessary, i.e.,
when the latch may be reset to NULL concurrently. Attached is the POC
patch implementing that. Thought?

> Maybe it's simpler to just not reset it to NULL, after all.
>
> Yes, you're right. I agree it's simpler to remove the code resetting
> the latch to NULL. Also as the comment for that code explains,
> basically it's not necessary to reset it to NULL.

On second thought, your comment in upthread "it seems a bit sloppy to
leave it pointing to a latch that belongs to a random process though."
seems right. So I'm inclined to avoid this approach, for now.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachment Content-Type Size
fix_possible_segfault_v1.patch text/plain 3.1 KB

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Tom Lane 2020-11-05 14:50:19 Re: pgsql: Add pg_depend.refobjversion.
Previous Message Peter Eisentraut 2020-11-05 06:15:10 Re: pgsql: Add pg_depend.refobjversion.