From: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Cc: | masahiko(dot)sawada(at)2ndquadrant(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Race condition in SyncRepGetSyncStandbysPriority |
Date: | 2020-04-20 06:05:46 |
Message-ID: | 1e8dde3b-468a-57d8-4a8b-725994e3f310@oss.nttdata.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2020/04/18 0:31, Tom Lane wrote:
> Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> writes:
>> At Fri, 17 Apr 2020 16:03:34 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>>> I agree that it might be worth considering the removal of am_sync for
>>> the master branch or v14. But I think that it should not be
>>> back-patched.
>
>> Ah! Agreed.
>
> Yeah, that's not necessary to fix the bug. I'd be inclined to leave
> it for v14 at this point.
>
> I don't much like the patch Fujii-san posted, though. An important part
> of the problem, IMO, is that SyncRepGetSyncStandbysPriority is too
> complicated and it's unclear what dependencies it has on the set of
> priorities in shared memory being consistent. His patch does not improve
> that situation; if anything it makes it worse.
Understood.
>
> If we're concerned about not breaking ABI in the back branches, what
> I propose we do about that is just leave SyncRepGetSyncStandbys in
> place but not used by the core code, and remove it only in HEAD.
> We can do an absolutely minimal fix for the assertion failure, in
> case anybody is calling that code, by just dropping the Assert and
> letting SyncRepGetSyncStandbys return NIL if it falls out. (Or we
> could let it return the incomplete list, which'd be the behavior
> you get today in a non-assert build.)
>
> Also, I realized while re-reading my patch that Kyotaro-san is onto
> something about the is_sync_standby flag not being necessary: instead
> we can just have the new function SyncRepGetCandidateStandbys return
> a reduced count. I'd initially believed that it was necessary for
> that function to return the rejected candidate walsenders along with
> the accepted ones, but that was a misunderstanding. I still don't
> want its API spec to say anything about ordering of the result array,
> but we don't need to.
>
> So that leads me to the attached. I propose applying this to the
> back branches except for the rearrangement of WALSnd field order.
> In HEAD, I'd remove SyncRepGetSyncStandbys and subroutines altogether.
Thanks for making and committing the patch!
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
From | Date | Subject | |
---|---|---|---|
Next Message | Fujii Masao | 2020-04-20 06:26:16 | Remove non-fast promotion Re: Should we remove a fallback promotion? take 2 |
Previous Message | Amit Kapila | 2020-04-20 06:00:09 | Re: where should I stick that backup? |