Re: logical decoding of two-phase transactions

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: logical decoding of two-phase transactions
Date: 2017-01-26 23:52:45
Message-ID: CAMsr+YHos95GFyfd2Gk61-66wpxFTgXJh=sBV3Rtfhe3xzG1HQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 26 January 2017 at 19:34, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru> wrote:

> Imagine following scenario:
>
> 1. PREPARE happend
> 2. PREPARE decoded and sent where it should be sent
> 3. We got all responses from participating nodes and issuing COMMIT/ABORT
> 4. COMMIT/ABORT decoded and sent
>
> After step 3 there is no more memory state associated with that prepared tx, so if will fail
> between 3 and 4 then we can’t know GID unless we wrote it commit record (or table).

If the decoding session crashes/disconnects and restarts between 3 and
4, we know the xact is now committed or rolled backand we don't care
about its gid anymore, we can decode it as a normal committed xact or
skip over it if aborted. If Pg crashes between 3 and 4 the same
applies, since all decoding sessions must restart.

No decoding session can ever start up between 3 and 4 without passing
through 1 and 2, since we always restart decoding at restart_lsn and
restart_lsn cannot be advanced past the assignment (BEGIN) of a given
xid until we pass its commit record and the downstream confirms it has
flushed the results.

The reorder buffer doesn't even really need to keep track of the gid
between 3 and 4, though it should do to save the output plugin and
downstream the hassle of keeping an xid to gid mapping. All it needs
is to know if we sent a given xact's data to the output plugin at
PREPARE time, so we can suppress sending them again at COMMIT time,
and we can store that info on the ReorderBufferTxn. We can store the
gid there too.

We'll need two new output plugin callbacks

prepare_cb
rollback_cb

since an xact can roll back after we decode PREPARE TRANSACTION (or
during it, even) and we have to be able to tell the downstream to
throw the data away.

I don't think the rollback callback should be called
abort_prepared_cb, because we'll later want to add the ability to
decode interleaved xacts' changes as they are made, before commit, and
in that case will also need to know if they abort. We won't care if
they were prepared xacts or not, but we'll know based on the
ReorderBufferTXN anyway.

We don't need a separate commit_prepared_cb, the existing commit_cb is
sufficient. The gid will be accessible on the ReorderBufferTXN.

Now, if it's simpler to just xlog the gid at COMMIT PREPARED time when
wal_level >= logical I don't think that's the end of the world. But
since we already have almost everything we need in memory, why not
just stash the gid on ReorderBufferTXN?

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-01-26 23:53:52 Re: Speedup twophase transactions
Previous Message Tom Lane 2017-01-26 23:39:09 Re: Performance improvement for joins where outer side is unique