Re: Race condition in SyncRepGetSyncStandbysPriority

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: masahiko(dot)sawada(at)2ndquadrant(dot)com, masao(dot)fujii(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Race condition in SyncRepGetSyncStandbysPriority
Date: 2020-04-17 05:58:34
Message-ID: 20200417.145834.1847060913969327935.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Thu, 16 Apr 2020 11:39:06 -0400, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote in
> Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> writes:
> > [ syncrep-fixes-4.patch ]
>
> I agree that we could probably improve the clarity of this code with
> further rewriting, but I'm still very opposed to the idea of having
> callers know that the first num_sync array elements are the active
> ones. It's wrong (or at least different from current behavior) for
> quorum mode, where there might be more than num_sync walsenders to
> consider. And it might not generalize very well to other syncrep
> selection rules we might add in future, which might also not have
> exactly num_sync interesting walsenders. So I much prefer an API
> definition that uses bool flags in an array that has no particular
> ordering (so far as the callers know, anyway). If you don't like
> is_sync_standby, how about some more-neutral name like is_active
> or is_interesting or include_position?

I'm convinced that each element has is_sync_standby. I agree to the
name is_sync_standby since I don't come up with a better name.

> I dislike the proposed comment revisions in SyncRepReleaseWaiters,
> too, particularly the change to say that what we're "announcing"
> is the ability to release waiters. You did not change the actual
> log messages, and you would have gotten a lot of pushback if
> you tried, because the current messages make sense to users and
> something like that would not. But by the same token this new
> comment isn't too helpful to somebody reading the code.

The current log messages look perfect to me. I don't insist on the
comment change since I might take the definition of "sync standby" too
strictly.

> (Actually, I wonder why we even have the restriction that only
> sync standbys can release waiters. It's not like they are
> going to get different results from SyncRepGetSyncRecPtr than
> any other walsender would. Maybe we should just drop all the
> am_sync logic?)

I thought the same thing, though I didn't do that in the last patch.

am_sync seems intending to reduce spurious wakeups but actually
spurious wakeup won't increase even without it. Thus the only
remaining task of am_sync is the trigger for the log messages and that
fact is the sign that the log messages should be emitted within
SyncRepGetSyncRecPtr. That eliminates references to SyncRepConfig in
SyncRepReleaseWaiters, which make me feel ease.

The attached is baed on syncrep-fixes-1.patch + am_sync elimination.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
syncrep-fixes-5.patch text/x-patch 25.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kashif Zeeshan 2020-04-17 06:00:55 Re: WIP/PoC for parallel backup
Previous Message Noah Misch 2020-04-17 05:41:46 Re: 001_rep_changes.pl stalls