Re: [HACKERS] logical decoding of two-phase transactions

From: Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Sokolov Yura <y(dot)sokolov(at)postgrespro(dot)ru>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2018-04-04 09:53:37
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers

> This is due to the new ERROR handling code that I added today for the
> lock/unlock APIs. Will fix.

Fixed. I continue to test this area for other issues.

>>> Also, current value for LOGICALREP_IS_COMMIT is 1, but previous code expected
> flags to be zero, so this way logical replication between postgres-10 and
> postgres-with-2pc-decoding will be broken.
> Good point. Will also test pg-10 to pg-11 logical replication to
> ensure that there are no issues.

I started making changes for supporting replication between
postgres-10 and postgres-11 but then very quickly realized that
pgoutput support is too far from being done. It needs to be optional
and per subscription. It definitely needs proto version bump and we
don't even have a framework for negotiating proto version yet (since
the proto was never bumped) so there is a chunk of completely new code
missing. For demo and functionality purposes we have test_decoding
support for 2pc decoding in this patch set. External plugins like bdr
and pglogical will be able to leverage this infrastructure as well.

Importantly, since we don't do negotiation then PG10 -> PG11
replication is not possible making one of the most important current
use cases not possible. To add support in pgoutput, we'd first have to
get multi-protocol publisher/subscriber communication working as a
pre-requisite. The good thing is that once we get the proto stuff in,
we can easily add the patch from the earlier patchset which provides
full 2PC decoding support in pgoutput.

So, we should consider not adding pgoutput support right away and I
have removed that patch from this patchset now. Another aspect of not
working on pgoutput is we need not worry about adding an
enable_twophase option to CREATE SUBSCRIPTION immediately as well. The
test_decoding plugin is easy to extend with options and the patch set
already does that for enabling/disabling 2PC decoding in it.

>>> So I think we need a subscription parameter to enable/disable this,
> defaulting to 'disabled'.
> Ok, will add it to the "CREATE SUBSCRIPTION", btw, we should have
> allowed storing options in an array form for a subscription. We might
> add more options in the future and adding fields one by one doesn't
> seem that extensible.

This is not needed since we should not look at pgoutput 2PC decode support now.

>> 1) twophase.c
>> ---------
>> I think this comment is slightly inaccurate:
>> /*
>> * Coordinate with logical decoding backends that may be already
>> * decoding this prepared transaction. When aborting a transaction,
>> * we need to wait for all of them to leave the decoding group. If
>> * committing, we simply remove all members from the group.
>> */
>> Strictly speaking, we're not waiting for the workers to leave the
>> decoding group, but to set decodeLocked=false. That is, we may proceed
>> when there still are members, but they must be in unlocked state.
> Agreed. Will modify it to mention that it will wait only if some of
> the backends are in locked state.

Modified the comment.

>> 2) reorderbuffer.c
>> ------------------
>> I've already said it before, I find the "flags" bitmask and rbtxn_*
>> macros way less readable than individual boolean flags. It was claimed
>> this was done on Andres' request, but I don't see that in the thread. I
>> admit it's rather subjective, though.
> Yeah, this is a little subjective.

If the committer has strong opinions on this, then I can whip up
patches along desired lines.

>> I see ReorederBuffer only does the lock/unlock around apply_change and
>> RelationIdGetRelation. That seems insufficient - RelidByRelfilenode can
>> do heap_open on pg_class, for example. And I guess we need to protect
>> rb->message too, because who knows what the plugin does in the callback?

Added lock/unlock APIs around rb->message and other places where
Relations are fetched.

>> Also, we should not allocate gid[GIDSIZE] for every transaction. For
>> example subxacts never need it, and it seems rather wasteful to allocate
>> 200B when the rest of the struct has only ~100B. This is particularly
>> problematic considering ReorderBufferTXN is not spilled to disk when
>> reaching the memory limit. It needs to be allocated ad-hoc only when
>> actually needed.
> OK, will look at allocating GID only when needed.
Done. Now GID is a char pointer and gets palloc'ed and pfree'd.

PFA, latest patchset.

Nikhil Sontakke
PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Cleaning-up-of-flags-in-ReorderBufferTXN-structure.0404.patch application/octet-stream 7.3 KB
0002-Introduce-LogicalLockTransaction-LogicalUnlockTransa.0404.patch application/octet-stream 32.4 KB
0003-Support-decoding-of-two-phase-transactions-at-PREPAR.0404.patch application/octet-stream 43.3 KB
0004-Teach-test_decoding-plugin-to-work-with-2PC.0404.patch application/octet-stream 25.3 KB
0005-Optional-Additional-test-case-to-demonstrate-decoding-rollbac.0404.patch application/octet-stream 4.1 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2018-04-04 10:31:09 Re: [HACKERS] Add support for tuple routing to foreign partitions
Previous Message Teodor Sigaev 2018-04-04 09:52:11 Re: json(b)_to_tsvector with numeric values