Re: SyncRepLock acquired exclusively in default configuration

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Asim Praveen <pasim(at)vmware(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ashwin Agrawal <ashwinstar(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Xin (Shin) Zhang (Pivotal)" <xzhang(at)pivotal(dot)io>
Subject: Re: SyncRepLock acquired exclusively in default configuration
Date: 2020-09-02 01:58:58
Message-ID: bb51edac-e278-4b66-0b47-2b8fe6ccc65a@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020/08/28 21:20, Masahiko Sawada wrote:
> On Fri, 28 Aug 2020 at 10:33, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>
>>
>>
>> On 2020/08/27 15:59, Asim Praveen wrote:
>>>
>>>> On 26-Aug-2020, at 11:10 PM, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>>>
>>>> I added the following comments based on the suggestion by Sawada-san upthread. Thought?
>>>>
>>>> + * Since this routine gets called every commit time, it's important to
>>>> + * exit quickly if sync replication is not requested. So we check
>>>> + * WalSndCtl->sync_standbys_define without the lock and exit
>>>> + * immediately if it's false. If it's true, we check it again later
>>>> + * while holding the lock, to avoid the race condition described
>>>> + * in SyncRepUpdateSyncStandbysDefined().
>>>>
>>>
>>> +1. May I suggest the following addition to the above comment (feel free to
>>> rephrase / reject)?
>>>
>>> "If sync_standbys_defined was being set from false to true and we observe it as
>>> false, it ok to skip the wait. Replication was async and it is in the process
>>> of being changed to sync, due to user request. Subsequent commits will observe
>>> the change and start waiting.”
>>
>> Thanks for the suggestion! I'm not sure if it's worth adding this because
>> it seems obvious thing. But maybe you imply that we need to comment
>> why the lock is not necessary when sync_standbys_defined is false. Right?
>> If so, what about updating the comments as follows?
>>
>> + * Since this routine gets called every commit time, it's important to
>> + * exit quickly if sync replication is not requested. So we check
>> + * WalSndCtl->sync_standbys_defined flag without the lock and exit
>> + * immediately if it's false. If it's true, we need to check it again later
>> + * while holding the lock, to check the flag and operate the sync rep
>> + * queue atomically. This is necessary to avoid the race condition
>> + * described in SyncRepUpdateSyncStandbysDefined(). On the other
>> + * hand, if it's false, the lock is not necessary because we don't touch
>> + * the queue.
>>
>>>
>>>>
>>>> Attached is the updated version of the patch. I didn't change how to
>>>> fix the issue. But I changed the check for fast exit so that it's called
>>>> before setting the "mode", to avoid a few cycle.
>>>>
>>>
>>>
>>> Looks good to me. There is a typo in the comment:
>>>
>>> s/sync_standbys_define/sync_standbys_defined/
>>
>> Fixed. Thanks!
>>
>
> Both v2 and v3 look good to me too. IMO I'm okay with and without the
> last sentence.

Asim and Sawada-san, thanks for the review! I pushed the patch.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2020-09-02 02:12:05 Re: SyncRepLock acquired exclusively in default configuration
Previous Message Michael Paquier 2020-09-02 01:48:34 Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly