Re: [HACKERS] logical decoding of two-phase transactions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2020-11-13 12:30:29
Message-ID: CAA4eK1LyzdpUo6A_LbxqoAu_aT-y5DpXi1xt9oCy_DT6CTd18w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 11, 2020 at 4:30 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
>
> Did some further tests on the problem I saw and I see that it does not
> have anything to do with this patch. I picked code from top of head.
> If I have enough changes in a transaction to initiate streaming, then
> it will also stream the same changes after a commit.
>
> BEGIN;
> INSERT INTO stream_test SELECT repeat('a', 10) || g.i FROM
> generate_series(1,800) g(i);
> SELECT data FROM pg_logical_slot_get_changes('regression_slot',
> NULL,NULL, 'two-phase-commit', '1', 'include-xids', '0',
> 'skip-empty-xacts', '1', 'stream-changes', '1');
> ** see streamed output here **
> END;
> SELECT data FROM pg_logical_slot_get_changes('regression_slot',
> NULL,NULL, 'two-phase-commit', '1', 'include-xids', '0',
> 'skip-empty-xacts', '1', 'stream-changes', '1');
> ** see the same streamed output here **
>
> I think this is because since the transaction has not been committed,
> SnapBuildCommitTxn is not called which is what moves the
> "builder->start_decoding_at", and as a result
> later calls to pg_logical_slot_get_changes will start from the
> previous lsn.
>

No, we always move start_decoding_at after streaming changes. It will
be moved because we have advanced the confirmed_flush location after
streaming all the changes (via LogicalConfirmReceivedLocation()) which
will be used to set 'start_decoding_at' when we create decoding
context (CreateDecodingContext) next time. However, we don't advance
'restart_lsn' due to which it start from the same point and accumulate
all changes for transaction each time. Now, after Commit we get an
extra record which is ahead of 'start_decoding_at' and we try to
decode it, it will get all the changes of the transaction. It might be
that we update the documentation for pg_logical_slot_get_changes() to
indicate the same but I don't think this is a problem.

> I did do a quick test in pgoutput using pub/sub and I
> dont see duplication of data there but I haven't
> verified exactly what happens.
>

Yeah, because we always move ahead for WAL locations in that unless
the subscriber/publisher is restarted in which case it should start
from the required location. But still, we can try to see if there is
any bug.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Antonin Houska 2020-11-13 12:39:23 Re: POC: Cleaning up orphaned files using undo logs
Previous Message Ajin Cherian 2020-11-13 12:26:36 Re: [HACKERS] logical decoding of two-phase transactions