Re: logical decoding of two-phase transactions

From: Sokolov Yura <y(dot)sokolov(at)postgrespro(dot)ru>
To: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
Cc: Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: logical decoding of two-phase transactions
Date: 2017-10-26 19:01:40
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2017-09-27 14:46, Stas Kelvich wrote:
>> On 7 Sep 2017, at 18:58, Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>
>> wrote:
>> Hi,
>> FYI all, wanted to mention that I am working on an updated version of
>> the latest patch that I plan to submit to a later CF.
> Cool!
> So what kind of architecture do you have in mind? Same way as is it
> was implemented before?
> As far as I remember there were two main issues:
> * Decodong of aborted prepared transaction.
> If such transaction modified catalog then we can’t read reliable info
> with our historic snapshot,
> since clog already have aborted bit for our tx it will brake
> visibility logic. There are some way to
> deal with that — by doing catalog seq scan two times and counting
> number of tuples (details
> upthread) or by hijacking clog values in historic visibility function.
> But ISTM it is better not solve this
> issue at all =) In most cases intended usage of decoding of 2PC
> transaction is to do some form
> of distributed commit, so naturally decoding will happens only with
> in-progress transactions and
> we commit/abort will happen only after it is decoded, sent and
> response is received. So we can
> just have atomic flag that prevents commit/abort of tx currently being
> decoded. And we can filter
> interesting prepared transactions based on GID, to prevent holding
> this lock for ordinary 2pc.
> * Possible deadlocks that Andres was talking about.
> I spend some time trying to find that, but didn’t find any. If locking
> pg_class in prepared tx is the only
> example then (imho) it is better to just forbid to prepare such
> transactions. Otherwise if some realistic
> examples that can block decoding are actually exist, then we probably
> need to reconsider the way
> tx being decoded. Anyway this part probably need Andres blessing.

Just rebased patch logical_twophase_v6 to master.

Fixed small issues:
- XactLogAbortRecord wrote DBINFO twice, but it was decoded in
ParseAbortRecord only once. Second DBINFO were parsed as ORIGIN.
Fixed by removing second write of DBINFO.
- SnapBuildPrepareTxnFinish tried to remove xid from `running` instead
of `committed`. And it removed only xid, without subxids.
- test_decoding skipped returning "COMMIT PREPARED" and "ABORT

Big issue were with decoding ddl-including two-phase transactions:
- prepared.out were misleading. We could not reproduce decoding body of
"test_prepared#3" with logical_twophase_v6.diff. It was skipped if
`pg_logical_slot_get_changes` were called without
`twophase-decode-with-catalog-changes` set, and only "COMMIT PREPARED
test_prepared#3" were decoded.
The reason is "PREPARE TRANSACTION" is passed to `pg_filter_prepare`
- first on "PREPARE" itself,
- second - on "COMMIT PREPARED".
In v6, `pg_filter_prepare` without `with-catalog-changes` first time
answered "true" (ie it should not be decoded), and second time (when
transaction became committed) it answered "false" (ie it should be
decoded). But second time in DecodePrepare
is already in a future compared to `buf->origptr` (because it is
on "COMMIT PREPARED" lsn). Therefore DecodePrepare just called
If `pg_filter_prepare` is called with `with-catalog-changes`, then
it returns "false" both times, thus DeocdePrepare decodes transaction
in first time, and calls `ReorderBufferForget` in second time.

I didn't found a way to fix it gracefully. I just change
to return same answer both times: "false" if called
(ie need to call DecodePrepare), and "true" otherwise. With this
change, catalog changing two-phase transaction is decoded as simple
one-phase transaction, if `pg_logical_slot_get_changes` is called
without `with-catalog-changes`.

With regards,
Sokolov Yura
Postgres Professional:
The Russian Postgres Company

Attachment Content-Type Size
logical_twophase_v7.diff.gz application/x-gzip 14.4 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2017-10-26 19:19:18 Re: Removing [Merge]Append nodes which contain a single subpath
Previous Message Michael Paquier 2017-10-26 18:13:55 Re: [BUGS] Bug in Physical Replication Slots (at least 9.5)?