Re: [HACKERS] logical decoding of two-phase transactions

From: Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Sokolov Yura <y(dot)sokolov(at)postgrespro(dot)ru>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2018-08-01 14:00:21
Message-ID: CAMGcDxeqEpWj3fTXwqhSwBdXd2RS9jzwWscO-XbeCfso6ts3+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi,

PFA, latest patchset which incorporates the additional feedback.

>>> There's an additional test case in
>>> 0005-Additional-test-case-to-demonstrate-decoding-rollbac.patch which
>>> uses a sleep in the "change" plugin API to allow a concurrent rollback
>>> on the 2PC being currently decoded. Andres generally doesn't like this
>>> approach :-), but there are no timing/interlocking issues now, and the
>>> sleep just helps us do a concurrent rollback, so it might be ok now,
>>> all things considered. Anyways, it's an additional patch for now.
>>
>> Yea, I still don't think it's ok. The tests won't be reliable. There's
>> ways to make this reliable, e.g. by forcing a lock to be acquired that's
>> externally held or such. Might even be doable just with a weird custom
>> datatype.
>>
>
> Ok, I will look at ways to do away with the sleep.
>

The attached patchset implements a non-sleep based approached by
sending the 2PC XID to the pg_logical_slot_get_changes() function as
an option for the test_decoding plugin. So, an example invocation
will now look like:

SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,
NULL, 'skip-empty-xacts', '1', 'check-xid', '$xid2pc');

The test_decoding pg_decode_change() API if it sees a valid xid
argument will wait for it to be aborted. Another backend can then come
in and merrily abort this ongoing 2PC in the background. Once it's
aborted, the pg_decode_change API will go ahead and will hit an ERROR
in the systable scan APIs. That should take care of Andres' concern
about using sleep in the tests. The relevant tap test has been added
to this patchset.

>>> @@ -423,6 +423,16 @@ systable_getnext(SysScanDesc sysscan)
>>> else
>>> htup = heap_getnext(sysscan->scan, ForwardScanDirection);
>>>
>>> + /*
>>> + * If CheckXidAlive is valid, then we check if it aborted. If it did, we
>>> + * error out
>>> + */
>>> + if (TransactionIdIsValid(CheckXidAlive) &&
>>> + TransactionIdDidAbort(CheckXidAlive))
>>> + ereport(ERROR,
>>> + (errcode(ERRCODE_TRANSACTION_ROLLBACK),
>>> + errmsg("transaction aborted during system catalog scan")));
>>> +
>>> return htup;
>>> }
>>
>> Don't we have to check TransactionIdIsInProgress() first? C.f. header
>> comments in tqual.c. Note this is also not guaranteed to be correct
>> after a crash (where no clog entry will exist for an aborted xact), but
>> we probably shouldn't get here in that case - but better be safe.
>>
>> I suspect it'd be better reformulated as
>> TransactionIdIsValid(CheckXidAlive) &&
>> !TransactionIdIsInProgress(CheckXidAlive) &&
>> !TransactionIdDidCommit(CheckXidAlive)
>>
>> What do you think?
>>

Modified the checks are per the above suggestion.

> I was wondering if anything else would be needed for user-defined
> catalog tables..
>

We don't need to do anything else for user-defined catalog tables
since they will also get accessed via the systable_* scan APIs.

>
> Hmm, lemme see if we can do it outside of the plugin. But note that a
> plugin might want to decode some 2PC at prepare time and another are
> "commit prepared" time.
>

The test_decoding pg_decode_filter_prepare() API implements a simple
filter strategy now. If the GID contains a substring "nodecode", then
it filters out decoding of such a 2PC at prepare time. Have added
steps to test this in the relevant test case in this patch.

I believe this patchset handles all pending issues along with relevant
test cases. Comments, further feedback appreciated.

Regards,
Nikhils
--
Nikhil Sontakke http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Cleaning-up-of-flags-in-ReorderBufferTXN-structure.patch application/octet-stream 7.9 KB
0002-Support-decoding-of-two-phase-transactions-at-PREPAR.patch application/octet-stream 40.1 KB
0003-Gracefully-handle-concurrent-aborts-of-uncommitted-t.patch application/octet-stream 9.3 KB
0004-Teach-test_decoding-plugin-to-work-with-2PC.patch application/octet-stream 28.1 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Emre Hasegeli 2018-08-01 14:07:47 Re: New Defects reported by Coverity Scan for PostgreSQL
Previous Message Tom Lane 2018-08-01 13:57:04 Re: New Defects reported by Coverity Scan for PostgreSQL