Re: Prepared transaction releasing locks before deregistering its GID

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Oleksii Kliukin <alexk(at)hintbits(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Prepared transaction releasing locks before deregistering its GID
Date: 2019-02-21 05:54:31
Message-ID: 20190221055431.GO15532@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 20, 2019 at 11:50:42AM +0100, Oleksii Kliukin wrote:
> RecoverPreparedTransactions calls ProcessRecords with
> twophase_recover_callbacks right after releasing the TwoPhaseStateLock, so I
> think lock_held should be false here (matching the similar call of
> TwoPhaseGetDummyProc at lock_twophase_recover).

Indeed. That would be a bad idea. We don't actually stress this
routine in 009_twophase.pl or the assertion I added would have blown
up immediately. So I propose to close the gap, and the updated patch
attached adds dedicated tests causing the problem you spotted to be
triggered (soft and hard restarts with 2PC transactions including
DDLs). The previous version of the patch and those tests cause the
assertion to blow up, failing the tests.

> Since the patch touches TwoPhaseGetDummyBackendId, let’s fix the comment
> header to mention it instead of TwoPhaseGetDummyProc

Not sure I understand the issue you are pointing out here. The patch
adds an extra argument to both TwoPhaseGetDummyProc() and
TwoPhaseGetDummyBackendId(), and the headers of both functions
document the new argument.

One extra trick I have been using for testing this patch is the
following:
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -816,13 +816,6 @@ TwoPhaseGetGXact(TransactionId xid, bool lock_held)

Assert(!lock_held || LWLockHeldByMe(TwoPhaseStateLock));

- /*
- * During a recovery, COMMIT PREPARED, or ABORT PREPARED, we'll be called
- * repeatedly for the same XID. We can save work with a simple cache.
- */
- if (xid == cached_xid)
- return cached_gxact;

This has the advantage to check more aggressively for lock conflicts,
causing the tests to deadlock if the flag is not correctly set in the
worst case scenario.
--
Michael

Attachment Content-Type Size
2pc-commit-locks-v3.patch text/x-diff 9.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tsunakawa, Takayuki 2019-02-21 05:55:50 RE: Timeout parameters
Previous Message Ideriha, Takeshi 2019-02-21 05:41:56 RE: Protect syscache from bloating with negative cache entries