Re: [HACKERS] logical decoding of two-phase transactions

From: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
To: Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Sokolov Yura <y(dot)sokolov(at)postgrespro(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2018-02-08 12:43:23
Message-ID: 11AEF68F-14A2-4662-AAEE-87422ADA3864@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

Thanks for working on this patch.

Reading through patch I’ve noticed that you deleted call to SnapBuildCommitTxn()
in DecodePrepare(). As you correctly spotted upthread there was unnecessary
code that marked transaction as running after decoding of prepare. However call
marking it as committed before decoding of prepare IMHO is still needed as
SnapBuildCommitTxn does some useful thing like setting base snapshot for parent
transactions which were skipped because of SnapBuildXactNeedsSkip().

E.g. current code will crash in assert for following transaction:

BEGIN;
SAVEPOINT one;
CREATE TABLE test_prepared_savepoints (a int);
PREPARE TRANSACTION 'x';
COMMIT PREPARED 'x';
:get_with2pc_nofilter
:get_with2pc_nofilter <- second call will crash decoder

With following backtrace:

frame #3: 0x000000010dc47b40 postgres`ExceptionalCondition(conditionName="!(txn->ninvalidations == 0)", errorType="FailedAssertion", fileName="reorderbuffer.c", lineNumber=1944) at assert.c:54
frame #4: 0x000000010d9ff4dc postgres`ReorderBufferForget(rb=0x00007fe1ab832318, xid=816, lsn=35096144) at reorderbuffer.c:1944
frame #5: 0x000000010d9f055c postgres`DecodePrepare(ctx=0x00007fe1ab81b918, buf=0x00007ffee2650408, parsed=0x00007ffee2650088) at decode.c:703
frame #6: 0x000000010d9ef718 postgres`DecodeXactOp(ctx=0x00007fe1ab81b918, buf=0x00007ffee2650408) at decode.c:310

That can be fixed by calling SnapBuildCommitTxn() in DecodePrepare()
which I believe is safe because during normal work prepared transaction
holds relation locks until commit/abort and in between nobody can access
altered relations (or just I don’t know such situations — that was the reason
why i had marked that xids as running in previous versions).

> On 6 Feb 2018, at 15:20, Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com> wrote:
>
> Hi all,
>
>>
>> PFA, patch which applies cleanly against latest git head. I also
>> removed unwanted newlines and took care of the cleanup TODO about
>> making ReorderBufferTXN structure using a txn_flags field instead of
>> separate booleans for various statuses like has_catalog_changes,
>> is_subxact, is_serialized etc. The patch uses this txn_flags field for
>> the newer prepare related info as well.
>>
>> "make check-world" passes ok, including the additional regular and tap
>> tests that we have added as part of this patch.
>>
>
> PFA, latest version of this patch.
>
> This latest version takes care of the abort-while-decoding issue along
> with additional test cases and documentation changes.
>
>

--
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2018-02-08 12:45:52 Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Previous Message Amit Kapila 2018-02-08 12:25:44 Re: In logical replication concurrent update of partition key creates a duplicate record on standby.