From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Cc: | wangchuanting <wangchuanting(at)huawei(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: Re: BUG #14680: startup process on standby encounter a deadlock of TwoPhaseStateLock when redo 2PC xlog |
Date: | 2017-06-13 23:08:12 |
Message-ID: | 20170613230812.uvlhjlvfbad7njb7@alvherre.pgsql |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs pgsql-hackers |
Michael Paquier wrote:
> The current coding is actually safe because the checkpointer does not
> remove or add any 2PC entry in the array while holding
> TwoPhaseStateLock, it just updates some values that need to be read
> and/or written while holding the lock. Well, to be honest, HEAD is
> wrong because it can read a flag value while the checkpointer updates
> it, and the patch is careful to change that to be correct. The wrong
> part is when calling ProcessTwoPhaseBuffer() in
> RecoverPreparedTransactions() which accesses gxact->ondisk and
> prepare_start_lsn without locking things.
Honestly I don't like this rationale very much. Even if doing the
unlocked access is safe today, it looks like installing a landmine for
the future, and for what? I don't think there's a lot to be gained:
RecoverPreparedTransactions only runs once in the life of a server, and
CheckPointTwoPhase is supposed to have a very short runtime (per
explanation in comments therein). It seems better to me to continue our
tradition of using the appropriate locks instead of playing a dangerous
game.
So I propose that RecoverPreparedTransactions grabs exclusive lock at
the top, and only the bottom part of the loop is done unlocked, which
AFAICS should be safe. (MarkAsPrepared gained a boolean argument
indicating that caller already holds lock).
Here's a patch along those lines. The full testsuite is running now,
but the recovery tests pass fine.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment | Content-Type | Size |
---|---|---|
2pc-redo-lwlock-fix-v6.patch | text/plain | 13.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2017-06-14 01:03:11 | Re: Re: BUG #14680: startup process on standby encounter a deadlock of TwoPhaseStateLock when redo 2PC xlog |
Previous Message | Michael Paquier | 2017-06-13 20:38:52 | Re: [BUGS] Invalid WAL segment size. Allowed values are 1,2,4,8,16,32,64 |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2017-06-14 01:03:11 | Re: Re: BUG #14680: startup process on standby encounter a deadlock of TwoPhaseStateLock when redo 2PC xlog |
Previous Message | Stephen Frost | 2017-06-13 22:29:20 | Re: WIP: Data at rest encryption |