Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jan Wieck <jan(at)wi3ck(dot)info>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)
Date: 2014-05-06 11:44:14
Message-ID: 20140506114414.GN17909@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2014-05-05 13:41:00 +0300, Heikki Linnakangas wrote:
> I came up with the attached fix for this. Currently, the entry is implicitly
> considered dead or unlocked if the locking_xid transaction is no longer
> active, but this patch essentially turns locking_xid into a simple boolean,
> and makes it the backend's responsibility to clear it on abort. (it's not
> actually a boolean, it's a BackendId, but that's just for debugging purposes
> to track who's keeping the entry locked). This requires a process exit hook,
> and an abort hook, to make sure the entry is always released, but that's not
> too difficult. It allows the backend to release the entry at exactly the
> right time, instead of having it implicitly released by

> I considered Andres' idea of using a new heavy-weight lock, but didn't like
> it much. It would be a larger patch, which is not nice for back-patching.
> One issue would be that if you run out of lock memory, you could not roll
> back any prepared transactions, which is not nice because it could be a
> prepared transaction that's hoarding the lock memory.

I am not convinced by the latter reasoning but you're right that any
such change would hardly be backpatchable.

> +/*
> + * Exit hook to unlock the global transaction entry we're working on.
> + */
> +static void
> +AtProcExit_Twophase(int code, Datum arg)
> +{
> + /* same logic as abort */
> + AtAbort_Twophase();
> +}
> +
> +/*
> + * Abort hook to unlock the global transaction entry we're working on.
> + */
> +void
> +AtAbort_Twophase(void)
> +{
> + if (MyLockedGxact == NULL)
> + return;
> +
> + /*
> + * If we were in process of preparing the transaction, but haven't
> + * written the WAL record yet, remove the global transaction entry.
> + * Same if we are in the process of finishing an already-prepared
> + * transaction, and fail after having already written the WAL 2nd
> + * phase commit or rollback record.
> + *
> + * After that it's too late to abort, so just unlock the GlobalTransaction
> + * entry. We might not have transfered all locks and other state to the
> + * prepared transaction yet, so this is a bit bogus, but it's the best we
> + * can do.
> + */
> + if (!MyLockedGxact->valid)
> + {
> + RemoveGXact(MyLockedGxact);
> + }
> + else
> + {
> + LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
> +
> + MyLockedGxact->locking_backend = InvalidBackendId;
> +
> + LWLockRelease(TwoPhaseStateLock);
> + }
> + MyLockedGxact = NULL;
> +}

Is it guaranteed that all paths have called LWLockReleaseAll()
before calling the proc exit hooks? Otherwise we might end up waiting
for ourselves...

> /*
> * MarkAsPreparing
> @@ -261,29 +329,15 @@ MarkAsPreparing(TransactionId xid, const char *gid,
> errmsg("prepared transactions are disabled"),
> errhint("Set max_prepared_transactions to a nonzero value.")));
>
> - LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
> -
> - /*
> - * First, find and recycle any gxacts that failed during prepare. We do
> - * this partly to ensure we don't mistakenly say their GIDs are still
> - * reserved, and partly so we don't fail on out-of-slots unnecessarily.
> - */
> - for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
> + /* on first call, register the exit hook */
> + if (!twophaseExitRegistered)
> {
> - gxact = TwoPhaseState->prepXacts[i];
> - if (!gxact->valid && !TransactionIdIsActive(gxact->locking_xid))
> - {
> - /* It's dead Jim ... remove from the active array */
> - TwoPhaseState->numPrepXacts--;
> - TwoPhaseState->prepXacts[i] = TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts];
> - /* and put it back in the freelist */
> - gxact->next = TwoPhaseState->freeGXacts;
> - TwoPhaseState->freeGXacts = gxact;
> - /* Back up index count too, so we don't miss scanning one */
> - i--;
> - }
> + before_shmem_exit(AtProcExit_Twophase, 0);
> + twophaseExitRegistered = true;
> }

It's not particularly nice to register shmem exit hooks in the middle of
normal processing because it makes it impossible to use
cancel_before_shmem_exit() previously registered hooks. I think this
should be registered at startup, if max_prepared_xacts > 0.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2014-05-06 11:45:43 Re: pg_shmem_allocations view
Previous Message Michael Meskes 2014-05-06 11:31:46 Re: using array of char pointers gives wrong results