Re: Re: BUG #14680: startup process on standby encounter a deadlock of TwoPhaseStateLock when redo 2PC xlog

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: wangchuanting <wangchuanting(at)huawei(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Re: BUG #14680: startup process on standby encounter a deadlock of TwoPhaseStateLock when redo 2PC xlog
Date: 2017-06-12 23:39:05
Message-ID: CAB7nPqSZwUbHX6jejbVbeB1LfrKHFf5giuTLjFXTXXtq8LZcxQ@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Tue, Jun 13, 2017 at 7:49 AM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> However, I found out that this rationale is likely not true, because the
> checkpointer may be running concurrently with this code from startup
> process, and checkpointer does process 2PC data. Maybe there are other
> reasons why there's no live bug here, but it looks wrong (I didn't try
> to reproduce a problem).

The current coding is actually safe because the checkpointer does not
remove or add any 2PC entry in the array while holding
TwoPhaseStateLock; it just updates some values that need to be read
and/or written while holding the lock. Well, to be honest, HEAD is
wrong because it can read a flag value while the checkpointer updates
it, and the patch is careful to change that to be correct. The wrong
part is the call to ProcessTwoPhaseBuffer() in
RecoverPreparedTransactions(), which accesses gxact->ondisk and
prepare_start_lsn without holding the lock.
--
Michael
