Re: BUG #14721: Assertion of synchronous replication

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: const_sunny(at)126(dot)com
Cc: PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14721: Assertion of synchronous replication
Date: 2017-06-29 05:11:47
Message-ID: CAEepm=33AAJFu9CMPhCsnX-Zg6ZySrga_oAjRV0Dtgx2G03kpQ@mail.gmail.com
Lists: pgsql-bugs

On Thu, Jun 29, 2017 at 4:27 PM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> /*
> * Acquiring the lock is not needed, the latch ensures proper
> * barriers. If it looks like we're done, we must really be done,
> * because once walsender changes the state to SYNC_REP_WAIT_COMPLETE,
> * it will never update it again, so we can't be seeing a stale value
> * in that case.
> */

Yeah, counting on the latch for free barriers doesn't work if you
happen to see SYNC_REP_WAIT_COMPLETE the first time through the loop,
or if a spurious signal wakes you and the state is then immediately
set to SYNC_REP_WAIT_COMPLETE. In those cases the latch provided no
barrier between the walsender's update and your read, so the Assert
statement that follows is making an assertion about cache coherency
that doesn't hold even on a friendly TSO system.
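
To illustrate the general pattern (this is a sketch, not the attached
patch and not PostgreSQL code): one thread writes ordinary data and then
publishes a state flag, and another thread polls the flag without a lock
and then relies on the data. The names wait_state, queue_detached, waker
and run_handoff are hypothetical, and C11 atomics with POSIX threads
stand in for PostgreSQL's own barrier primitives. The release store and
acquire load are what make the final read safe; they also constrain the
compiler, which matters even on hardware with strong ordering.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical states, loosely modelled on the waiting/complete pair. */
enum { WAITING = 0, WAIT_COMPLETE = 1 };

static int queue_detached;      /* plain data, written before the flag */
static atomic_int wait_state;   /* flag that the waiter polls without a lock */

/* Plays the role of the process that finishes the wait. */
static void *waker(void *arg)
{
    (void) arg;
    queue_detached = 1;
    /* Release: the write above becomes visible before the flag does. */
    atomic_store_explicit(&wait_state, WAIT_COMPLETE, memory_order_release);
    return NULL;
}

/* Plays the role of the waiting backend; returns the data it observed. */
int run_handoff(void)
{
    pthread_t tid;
    int seen;

    queue_detached = 0;
    atomic_store(&wait_state, WAITING);
    if (pthread_create(&tid, NULL, waker, NULL) != 0)
        return -1;

    /* Busy-wait for brevity. Acquire pairs with the release store above. */
    while (atomic_load_explicit(&wait_state, memory_order_acquire) != WAIT_COMPLETE)
        ;

    /* Safe only because of the acquire/release pairing. */
    seen = queue_detached;
    pthread_join(tid, NULL);
    return seen;
}
```

Calling run_handoff() from a main() and asserting on its result mirrors
the Assert after the wait loop. If both accesses to wait_state were
relaxed, the language would no longer guarantee that the waiter sees the
write to queue_detached.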

Can you reproduce the problem with this experimental patch applied?

--
Thomas Munro
http://www.enterprisedb.com

Attachment: barriers.patch (application/octet-stream, 1.2 KB)
