Re: SSI and 2PC

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: <pgsql-hackers(at)postgresql(dot)org>, "Dan Ports" <drkp(at)csail(dot)mit(dot)edu>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Jeff Davis" <pgsql(at)j-davis(dot)com>
Subject: Re: SSI and 2PC
Date: 2011-01-11 18:08:07
Message-ID: A05653FA-5F08-44CC-B53F-62DC1E9A61E0@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Jan10, 2011, at 18:50 , Kevin Grittner wrote:
> I'm trying not to panic here, but I haven't looked at 2PC before
> yesterday and am just dipping into the code to support it, and time
> is short. Can anyone give me a pointer to anything I should read
> before I dig through the 2PC code, which might accelerate this?

It roughly works as follows

Upon PREPARE, the locks previously held by the transaction are transferred
to a kind of virtual backend which only consists of a special proc array
entry. The transaction will thus still appear to be running, and will still
be holding its locks, even after the original backend is gone. The information
necessary to reconstruct that proc array entry is also written to the 2PC state,
and used to recreate the "virtual backend" after a restart or crash.

There are also some additional pieces of transaction state which are stored
in the 2PC state file like the full list of subtransaction xids (The proc array
entry may not contain all of them if it overflowed).

Upon COMMIT PREPARED, the information in the 2PC state file is used to write
a COMMIT wal record and to update the clog. The transaction is then committed,
and the special proc array entry is removed and all lockmgr locks it held are
released.

For 2PC to work for SSI transaction, I guess you must check for conflicts
during PREPARE - at any later point the COMMIT may only fail transiently,
not permanently. Any transaction that adds a conflict with an already
prepared transaction must check if that conflict completes a dangerous
structure, and abort if this is the case, since the already PREPAREd transaction
can no longer be aborted. COMMIT PREPARED then probably doesn't need to do
anything special for SSI transactions, apart from some cleanup actions maybe.

Unfortunately, it seems that doing things this way will undermine the guarantee
that retrying a failed SSI transaction won't fail due to the same conflict as
it did originally. Consider

T1> BEGIN TRANSACTION ISOLATION SERIALIZABLE
T1> SELECT * FROM T
T1> UPDATE T ...
T1> PREPARE TRANSACTION

T2> BEGIN TRANSACTION ISOLATION SERIALIZABLE
T2> SELECT * FROM T
T2> UPDATE T ...
-> Serialization Error

Retrying T2 won't help as long as T1 isn't COMMITTED.

There doesn't seems a way around that, however - any correct implementation
of 2PC for SSI will have to behave that way I fear :-(

Hope this helps & best regards,
Florian Pflug

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2011-01-11 18:17:20 Allowing multiple concurrent base backups
Previous Message Tom Lane 2011-01-11 17:59:36 Re: autogenerating error code lists (was Re: [COMMITTERS] pgsql: Add foreign data wrapper error code values for SQL/MED.)