Re: Race condition in SyncRepGetSyncStandbysPriority

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Race condition in SyncRepGetSyncStandbysPriority
Date: 2020-04-14 04:06:14
Message-ID: CA+fd4k4f+QrGOpEMR3GA6yHakCr6ssJB_EbRrdC-=miwEZf3mQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 14 Apr 2020 at 10:34, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> writes:
> > At Sat, 11 Apr 2020 18:30:30 -0400, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote in
> >> What I think we should do about this is, essentially, to get rid of
> >> SyncRepGetSyncStandbys. Instead, let's have each walsender advertise
> >> whether *it* believes that it is a sync standby, based on its last
> >> evaluation of the relevant GUCs. This would be a bool that it'd
> >> compute and set alongside sync_standby_priority. (Hm, maybe we'd not
>
> > Mmm.. SyncRepGetStandbyPriority returns the "priority" that a
> > walsender thinks it is at, among synchronous_standby_names. Then to
> > decide "I am a sync standby" we need to know how many walsenders with
> > higher priority are alive now. SyncRepGetSyncStandbyPriority does the
> > judgment now and suffers from the inconsistency of priority values.
>
> Yeah. After looking a bit closer, I think that the current definition
> of sync_standby_priority (that is, as the result of local evaluation
> of SyncRepGetStandbyPriority()) is OK. The problem is what we're doing
> with it. I suggest that what we should do in SyncRepGetSyncRecPtr()
> is make one sweep across the WalSnd array, collecting PID,
> sync_standby_priority, *and* the WAL pointers from each valid entry.
> Then examine that data and decide which WAL value we need, without assuming
> that the sync_standby_priority values are necessarily totally consistent.
> But in any case we must examine each entry just once while holding its
> mutex, not go back to it later expecting it to still be the same.

Can we have a similar approach of sync_standby_defined for
sync_standby_priority? That is, checkpionter is responsible for
changing sync_standby_priority of all walsenders when SIGHUP. That
way, all walsenders can see a consistent view of
sync_standby_priority. And when a walsender starts, it sets
sync_standby_priority by itself. The logic to decide who's a sync
standby doesn't change. SyncRepGetSyncRecPtr() gets all walsenders
having higher priority along with their WAL positions.

>
> Another thing that I'm finding interesting is that I now see this is
> not at all new code. It doesn't look like SyncRepGetSyncStandbysPriority
> has changed much since 2016. So how come we didn't detect this problem
> long ago? I searched the buildfarm logs for assertion failures in
> syncrep.c, looking back one year, and here's what I found:
>
> sysname | branch | snapshot | stage | l
> ------------+---------------+---------------------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------
> nightjar | REL_10_STABLE | 2019-08-13 23:04:41 | recoveryCheck | TRAP: FailedAssertion("!(((bool) 0))", File: "/pgbuild/root/REL_10_STABLE/pgsql.build/../pgsql/src/backend/replication/syncrep.c", Line: 940)
> hoverfly | REL9_6_STABLE | 2019-11-07 17:19:12 | recoveryCheck | TRAP: FailedAssertion("!(((bool) 0))", File: "syncrep.c", Line: 723)
> hoverfly | HEAD | 2019-11-22 12:15:08 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c", Line: 951)
> francolin | HEAD | 2020-01-16 23:10:06 | recoveryCheck | TRAP: FailedAssertion("false", File: "/home/andres/build/buildfarm-francolin/HEAD/pgsql.build/../pgsql/src/backend/replication/syncrep.c", Line: 951)
> hoverfly | REL_11_STABLE | 2020-02-29 01:34:55 | recoveryCheck | TRAP: FailedAssertion("!(0)", File: "syncrep.c", Line: 946)
> hoverfly | REL9_6_STABLE | 2020-03-26 13:51:15 | recoveryCheck | TRAP: FailedAssertion("!(((bool) 0))", File: "syncrep.c", Line: 723)
> hoverfly | REL9_6_STABLE | 2020-04-07 21:52:07 | recoveryCheck | TRAP: FailedAssertion("!(((bool) 0))", File: "syncrep.c", Line: 723)
> curculio | HEAD | 2020-04-11 18:30:21 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c", Line: 951)
> sidewinder | HEAD | 2020-04-11 18:45:39 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c", Line: 951)
> curculio | HEAD | 2020-04-11 20:30:26 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c", Line: 951)
> sidewinder | HEAD | 2020-04-11 21:45:48 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c", Line: 951)
> sidewinder | HEAD | 2020-04-13 10:45:35 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c", Line: 951)
> conchuela | HEAD | 2020-04-13 16:00:18 | recoveryCheck | TRAP: FailedAssertion("false", File: "/home/pgbf/buildroot/HEAD/pgsql.build/../pgsql/src/backend/replication/syncrep.c", Line: 951)
> sidewinder | HEAD | 2020-04-13 18:45:34 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c", Line: 951)
> (14 rows)
>
> The line numbers vary in the back branches, but all of these crashes are
> at that same Assert. So (a) yes, this does happen in the back branches,
> but (b) some fairly recent change has made it a whole lot more probable.
> Neither syncrep.c nor 007_sync_rep.pl have changed much in some time,
> so whatever the change was was indirect. Curious. Is it just timing?

Interesting. It's happening on certain animals, not all. Especially
tests with HEAD on sidewinder and curculio, which are NetBSD 7 and
OpenBSD 5.9 respectively, started to fail at a high rate since a
couple of days ago.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2020-04-14 04:08:29 Re: variation of row_number with parallel
Previous Message Rajkumar Raghuwanshi 2020-04-14 03:58:50 variation of row_number with parallel