From: | Ajin Cherian <itsajin(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] logical decoding of two-phase transactions |
Date: | 2020-11-30 09:05:49 |
Message-ID: | CAFPTHDYy7_50QzsXbYp4cr2L-PiOZ0+ya=+xwzYzLo0UWRO=Gg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Nov 29, 2020 at 1:07 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Then once you found which existing test covers
> that, you can try to generate prepared transaction behavior as
> mentioned above.
I was able to find out the test case that exercises that code, it is
the ondisk_startup spec in test_decoding. Using that, I was able to
create the problem with the following setup:
Using 4 sessions (this could be optimized to 3, but just sharing what
I've tested):
s1(session 1):
begin;
postgres=# begin;
BEGIN
postgres=*# SELECT pg_current_xact_id();
pg_current_xact_id
--------------------
546
(1 row)
--------------------the above commands leave a transaction running
s2:
CREATE TABLE do_write(id serial primary key);
SELECT 'init' FROM
pg_create_logical_replication_slot('isolation_slot', 'test_decoding');
---------------------this will hang because of 546 txn is pending
s3:
postgres=# begin;
BEGIN
postgres=*# SELECT pg_current_xact_id();
pg_current_xact_id
--------------------
547
(1 row)
-------------------------------- leave another txn pending---
s1:
postgres=*# ALTER TABLE do_write ADD COLUMN addedbys2 int;
ALTER TABLE
postgres=*# commit;
------------------------------commit the first txn; this will cause
state to move to SNAPBUILD_FULL_SNAPSHOT state
2020-11-30 03:31:07.354 EST [16312] LOG: logical decoding found
initial consistent point at 0/1730A18
2020-11-30 03:31:07.354 EST [16312] DETAIL: Waiting for transactions
(approximately 1) older than 553 to end.
s4:
postgres=# begin;
BEGIN
postgres=*# INSERT INTO do_write DEFAULT VALUES;
INSERT 0 1
postgres=*# prepare transaction 'test1';
PREPARE TRANSACTION
-------------- leave this transaction prepared
s3:
postgres=*# commit;
COMMIT
----------------- this will cause s2 call to return and a consistent
point has been reached.
2020-11-30 03:31:34.200 EST [16312] LOG: logical decoding found
consistent point at 0/1730D58
s4:
commit prepared 'test1';
s2:
postgres=# SELECT * FROM pg_logical_slot_get_changes('isolation_slot',
NULL, NULL, 'two-phase-commit', '1', 'include-xids', '0',
'skip-empty-xacts', '1');
lsn | xid | data
-----------+-----+-------------------------
0/1730FC8 | 553 | COMMIT PREPARED 'test1'
(1 row)
In pg_logical_slot_get_changes() we see only the Commit Prepared but
no insert and no prepare command. I debugged this and I see that in
DecodePrepare, the
prepare is skipped because the prepare lsn is prior to the
start_decoding_at point and is skipped in SnapBuildXactNeedsSkip. So,
the reason for skipping
the PREPARE is similar to the reason why it would have been skipped on
a restart after a previous decode run.
One possible fix would be similar to what you suggested, in
DecodePrepare , add the check DecodingContextReady(ctx), which if
false would indicate that the
PREPARE was prior to a consistent snapshot and if so, set a flag value
in txn accordingly (say RBTXN_PREPARE_NOT_DECODED?), and if this flag
is detected
while handling the COMMIT PREPARED, then handle it like you would
handle a COMMIT. This would ensure that all the changes of the
transaction are sent out
and at the same time, the subscriber side does not need to try and
handle a prepared transaction that does not exist on its side.
Let me know what you think of this?
regards,
Ajin Cherian
Fujitsu Australia
From | Date | Subject | |
---|---|---|---|
Next Message | Dilip Kumar | 2020-11-30 09:10:08 | Re: Is Recovery actually paused? |
Previous Message | Heikki Linnakangas | 2020-11-30 08:32:43 | Re: obsolete comment from WITH OIDS days |