Re: logical decoding of two-phase transactions

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: logical decoding of two-phase transactions
Date: 2017-03-28 02:53:55
Message-ID: CAMsr+YF9ya3PsyWekqBqaeRz9WC+roGWGY5qDso0Jx6O1ajAHQ@mail.gmail.com
Lists: pgsql-hackers

On 28 March 2017 at 09:25, Andres Freund <andres(at)anarazel(dot)de> wrote:

> If you actually need separate decoding of 2PC, then you want to wait for
> the PREPARE to be replicated. If that replication has to wait for the
> to-be-replicated prepared transaction to commit prepared, and commit
> prepare will only happen once replication happened...

In other words, the output plugin cannot decode a transaction at
PREPARE TRANSACTION time if that xact holds an AccessExclusiveLock on
a catalog relation we must be able to read in order to decode the
xact.
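
To make the circular wait explicit, here is a toy wait-for graph. It is only a sketch: the node names are descriptive labels I made up for illustration, not PostgreSQL internals, and the real lock manager would not report this as a deadlock because part of the wait happens outside the server.

```python
# Toy wait-for graph for the scenario above. All node names are
# illustrative labels (assumptions), not PostgreSQL data structures.

def find_cycle(waits_for, start):
    """Follow single wait edges from start; return the cycle as a list, or None."""
    path = []
    node = start
    while node is not None and node not in path:
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None  # the chain ended, so nothing waits forever
    # Close the loop by repeating the node we came back to.
    return path[path.index(node):] + [node]


waits_for = {
    # The decoder must read a catalog relation to decode the xact at PREPARE.
    "decoder at PREPARE": "catalog AccessExclusiveLock",
    # The prepared xact keeps that lock until it is committed.
    "catalog AccessExclusiveLock": "COMMIT PREPARED",
    # The commit is only issued once the PREPARE has been replicated.
    "COMMIT PREPARED": "PREPARE replicated",
    # Replicating the PREPARE requires the decoder to get past it.
    "PREPARE replicated": "decoder at PREPARE",
}

cycle = find_cycle(waits_for, "decoder at PREPARE")
print(" -> ".join(cycle))
```

Removing any one edge (for example, by decoding only at COMMIT PREPARED) breaks the cycle in this model.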

>> Is there any other scenarios where catalog readers are blocked except explicit lock
>> on catalog table? Alters on catalogs seems to be prohibited.
>
> VACUUM FULL on catalog tables (but that can't happen in xact => 2pc)
> CLUSTER on catalog tables (can happen in xact)
> ALTER on tables modified in the same transaction (even of non catalog
> tables!), because a lot of routines will do a heap_open() to get the
> tupledesc etc.

Right, and the last one is the main issue, since it's by far the
most likely and the hardest to work around.

The tests Stas has in place aren't sufficient to cover this, as they
decode only after everything has committed. I'm expanding the
pg_regress coverage to do decoding between prepare and commit (when we
actually care) first, and will add some tests involving strong locks.
I've found one bug where it doesn't decode a 2pc xact at prepare or
commit time, even without restart or strong lock issues. Pretty sure
it's due to assumptions made about the filter callback.

The current code as used by test_decoding won't work correctly. If
txn->has_catalog_changes is set and the xact is still in progress, the
filter skips decoding at PREPARE time. But the xact isn't then decoded
at COMMIT PREPARED time either, if we have already processed past the
PREPARE TRANSACTION. Bug.
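
A small model of that control flow may help. This is a sketch under stated assumptions, not the patch's code: `Txn`, `filter_prepare`, `decode_stream` and the `remember_skip` flag are hypothetical names, and what the buggy path emits at COMMIT PREPARED is my guess; the point is only that the xact's changes never reach the output.

```python
from dataclasses import dataclass

# Hypothetical model of the filter behaviour described above.
# None of these names come from the patch or from PostgreSQL.


@dataclass
class Txn:
    changes: list
    has_catalog_changes: bool
    in_progress: bool = True


def filter_prepare(txn):
    """Return True when decoding should be skipped at PREPARE time."""
    return txn.has_catalog_changes and txn.in_progress


def decode_stream(txn, remember_skip):
    """Process PREPARE TRANSACTION and then COMMIT PREPARED for one xact."""
    out = []

    # PREPARE TRANSACTION record
    skipped_at_prepare = filter_prepare(txn)
    if not skipped_at_prepare:
        out += txn.changes + ["PREPARE"]

    # COMMIT PREPARED record
    txn.in_progress = False
    if skipped_at_prepare and remember_skip:
        # Possible fix: the xact was never sent, so decode it in full now.
        out += txn.changes + ["COMMIT"]
    else:
        # Assumes the changes already went out at PREPARE time.
        out.append("COMMIT PREPARED")
    return out


buggy = decode_stream(Txn(["INSERT t1"], has_catalog_changes=True), remember_skip=False)
fixed = decode_stream(Txn(["INSERT t1"], has_catalog_changes=True), remember_skip=True)
plain = decode_stream(Txn(["INSERT t1"], has_catalog_changes=False), remember_skip=False)
print(buggy, fixed, plain)
```

In the `buggy` case the INSERT is missing from the output entirely, which matches the symptom: the xact is decoded neither at prepare nor at commit time.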

By skipping decoding of 2pc xacts with catalog changes in this
test, we also hide the locking issues.

However, even once I add an option to force decoding of 2pc xacts with
catalog changes to test_decoding, I cannot reproduce the expected
locking issues so far. See tests in attached updated version, in
contrib/test_decoding/sql/prepare.sql .

Haven't done any TAP tests yet, since the pg_regress tests are so far
sufficient to turn up issues.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment: logical_twophase_v4.patch (text/x-patch, 67.3 KB)
