Re: Race condition in SyncRepGetSyncStandbysPriority

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: masahiko(dot)sawada(at)2ndquadrant(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Race condition in SyncRepGetSyncStandbysPriority
Date: 2020-04-17 07:03:34
Message-ID: 3f58063d-2544-aea8-07aa-7e0c35484187@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020/04/17 14:58, Kyotaro Horiguchi wrote:
> At Thu, 16 Apr 2020 11:39:06 -0400, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote in
>> Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> writes:
>>> [ syncrep-fixes-4.patch ]
>>
>> I agree that we could probably improve the clarity of this code with
>> further rewriting, but I'm still very opposed to the idea of having
>> callers know that the first num_sync array elements are the active
>> ones. It's wrong (or at least different from current behavior) for
>> quorum mode, where there might be more than num_sync walsenders to
>> consider. And it might not generalize very well to other syncrep
>> selection rules we might add in future, which might also not have
>> exactly num_sync interesting walsenders. So I much prefer an API
>> definition that uses bool flags in an array that has no particular
>> ordering (so far as the callers know, anyway). If you don't like
>> is_sync_standby, how about some more-neutral name like is_active
>> or is_interesting or include_position?
>
> I'm convinced that each element has is_sync_standby. I agree to the
> name is_sync_standby since I don't come up with a better name.
>
>> I dislike the proposed comment revisions in SyncRepReleaseWaiters,
>> too, particularly the change to say that what we're "announcing"
>> is the ability to release waiters. You did not change the actual
>> log messages, and you would have gotten a lot of pushback if
>> you tried, because the current messages make sense to users and
>> something like that would not. But by the same token this new
>> comment isn't too helpful to somebody reading the code.
>
> The current log messages look perfect to me. I don't insist on the
> comment change since I might take the definition of "sync standby" too
> strictly.
>
>> (Actually, I wonder why we even have the restriction that only
>> sync standbys can release waiters. It's not like they are
>> going to get different results from SyncRepGetSyncRecPtr than
>> any other walsender would. Maybe we should just drop all the
>> am_sync logic?)
>
> I thought the same thing, though I didn't do that in the last patch.
>
> am_sync seems intending to reduce spurious wakeups but actually
> spurious wakeup won't increase even without it. Thus the only
> remaining task of am_sync is the trigger for the log messages and that
> fact is the sign that the log messages should be emitted within
> SyncRepGetSyncRecPtr. That eliminates references to SyncRepConfig in
> SyncRepReleaseWaiters, which make me feel ease.
>
> The attached is baed on syncrep-fixes-1.patch + am_sync elimination.

I agree that it might be worth considering the removal of am_sync for
the master branch or v14. But I think that it should not be back-patched.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2020-04-17 07:07:54 Re: Parallel Append can break run-time partition pruning
Previous Message Noah Misch 2020-04-17 06:56:49 Re: cleaning perl code