Re: logical decoding of two-phase transactions

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: logical decoding of two-phase transactions
Date: 2017-04-04 01:23:23
Message-ID: CAD21AoDDC7USpUyUim=7BwuFByauRGfLzkE7hhiV=jjDckc+KA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 30, 2017 at 12:55 AM, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru> wrote:
>
>> On 28 Mar 2017, at 18:08, Andres Freund <andres(at)anarazel(dot)de> wrote:
>>
>> On 2017-03-28 15:55:15 +0100, Simon Riggs wrote:
>>>
>>>
>>> That assertion is obviously false... the plugin can resolve this in
>>> various ways, if we allow it.
>>
>> Handling it by breaking replication isn't handling it (e.g. timeouts in
>> decoding etc). Handling it by rolling back *prepared* transactions
>> (which are supposed to be guaranteed to succeed!), isn't either.
>>
>>
>>> You can say that in your opinion you prefer to see this handled in
>>> some higher level way, though it would be good to hear why and how.
>>
>> It's pretty obvious why: A bit of DDL by the user shouldn't lead to the
>> issues mentioned above.
>>
>>
>>> Bottom line here is we shouldn't reject this patch on this point,
>>
>> I think it definitely has to be rejected because of that. And I didn't
>> bring this up at the last minute, I repeatedly brought it up before.
>> Both to Craig and Stas.
>
> Okay. In order to find more realistic cases that blocks replication
> i’ve created following setup:
>
> * in backend: tests_decoding plugins hooks on xact events and utility
> statement hooks and transform each commit into prepare, then sleeps
> on latch. If transaction contains DDL that whole statement pushed in
> wal as transactional message. If DDL can not be prepared or disallows
> execution in transaction block than it goes as nontransactional logical
> message without prepare/decode injection. If transaction didn’t issued any
> DDL and didn’t write anything to wal, then it skips 2pc too.
>
> * after prepare is decoded, output plugin in walsender unlocks backend
> allowing to proceed with commit prepared. So in case when decoding
> tries to access blocked catalog everything should stop.
>
> * small python script that consumes decoded wal from walsender (thanks
> Craig and Petr)
>
> After small acrobatics with that hooks I’ve managed to run whole
> regression suite in parallel mode through such setup of test_decoding
> without any deadlocks. I’ve added two xact_events to postgres and
> allowedn prepare of transactions that touched temp tables since
> they are heavily used in tests and creates a lot of noise in diffs.
>
> So it boils down to 3 failed regression tests out of 177, namely:
>
> * transactions.sql — here commit of tx stucks with obtaining SafeSnapshot().
> I didn’t look what is happening there specifically, but just checked that
> walsender isn’t blocked. I’m going to look more closely at this.
>
> * prepared_xacts.sql — here select prepared_xacts() sees our prepared
> tx. It is possible to filter them out, but obviously it works as expected.
>
> * guc.sql — here pendingActions arrives on 'DISCARD ALL’ preventing tx
> from being prepared. I didn’t found the way to check presence of
> pendingActions outside of async.c so decided to leave it as is.
>
> It seems that at least in regression tests nothing can block twophase
> logical decoding. Is that strong enough argument to hypothesis that current
> approach doesn’t creates deadlock except locks on catalog which should be
> disallowed anyway?
>
> Patches attached. logical_twophase_v5 is slightly modified version of previous
> patch merged with Craig’s changes. Second file is set of patches over previous
> one, that implements logic i’ve just described. There is runtest.sh script that
> setups postgres, runs python logical consumer in background and starts
> regression test.
>
>

I reviewed this patch but when I tried to build contrib/test_decoding
I got the following error.

$ make
gcc -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -g -g -fpic -I. -I. -I../../src/include -D_GNU_SOURCE -c -o
test_decoding.o test_decoding.c -MMD -MP -MF .deps/test_decoding.Po
test_decoding.c: In function '_PG_init':
test_decoding.c:126: warning: assignment from incompatible pointer type
test_decoding.c: In function 'test_decoding_process_utility':
test_decoding.c:271: warning: passing argument 5 of
'PreviousProcessUtilityHook' from incompatible pointer type
test_decoding.c:271: note: expected 'struct QueryEnvironment *' but
argument is of type 'struct DestReceiver *'
test_decoding.c:271: warning: passing argument 6 of
'PreviousProcessUtilityHook' from incompatible pointer type
test_decoding.c:271: note: expected 'struct DestReceiver *' but
argument is of type 'char *'
test_decoding.c:271: error: too few arguments to function
'PreviousProcessUtilityHook'
test_decoding.c:276: warning: passing argument 5 of
'standard_ProcessUtility' from incompatible pointer type
../../src/include/tcop/utility.h:38: note: expected 'struct
QueryEnvironment *' but argument is of type 'struct DestReceiver *'
test_decoding.c:276: warning: passing argument 6 of
'standard_ProcessUtility' from incompatible pointer type
../../src/include/tcop/utility.h:38: note: expected 'struct
DestReceiver *' but argument is of type 'char *'
test_decoding.c:276: error: too few arguments to function
'standard_ProcessUtility'
test_decoding.c: At top level:
test_decoding.c:285: warning: 'test_decoding_twophase_commit' was used
with no prototype before its definition
make: *** [test_decoding.o] Error 1

---
After applied both patches the regression test 'make check' failed. I
think you should update expected/transactions.out file as well.

$ cat src/test/regress/regression.diffs
*** /home/masahiko/pgsql/source/postgresql/src/test/regress/expected/transactions.out
Mon May 2 09:16:02 2016
--- /home/masahiko/pgsql/source/postgresql/src/test/regress/results/transactions.out
Tue Apr 4 09:52:44 2017
***************
*** 43,58 ****
-- Read-only tests
CREATE TABLE writetest (a int);
CREATE TEMPORARY TABLE temptest (a int);
! BEGIN;
! SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ ONLY, DEFERRABLE; -- ok
! SELECT * FROM writetest; -- ok
! a
! ---
! (0 rows)
!
! SET TRANSACTION READ WRITE; --fail
! ERROR: transaction read-write mode must be set before any query
! COMMIT;
BEGIN;
SET TRANSACTION READ ONLY; -- ok
SET TRANSACTION READ WRITE; -- ok
--- 43,53 ----
-- Read-only tests
CREATE TABLE writetest (a int);
CREATE TEMPORARY TABLE temptest (a int);
! -- BEGIN;
! -- SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ ONLY, DEFERRABLE; -- ok
! -- SELECT * FROM writetest; -- ok
! -- SET TRANSACTION READ WRITE; --fail
! -- COMMIT;
BEGIN;
SET TRANSACTION READ ONLY; -- ok
SET TRANSACTION READ WRITE; -- ok

======================================================================
There are still some unnecessary code in v5 patch.

---
+/* PREPARE callback */
+static void
+pg_decode_prepare_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
+ XLogRecPtr prepare_lsn)
+{
+ TestDecodingData *data = ctx->output_plugin_private;
+ int backend_procno;
+
+ // if (data->skip_empty_xacts && !data->xact_wrote_changes)
+ // return;
+
+ OutputPluginPrepareWrite(ctx, true);
+

Could you please update these patches?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vinayak 2017-04-04 01:27:23 Re: ANALYZE command progress checker
Previous Message Amit Langote 2017-04-04 01:19:31 Remove pg_stat_progress_vacuum from Table 28.2