Re: [HACKERS] logical decoding of two-phase transactions

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Ajin Cherian <itsajin(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2021-03-29 07:38:25
Message-ID: CALDaNm2ZnJeG23bE+gEOQEmXo8N+fs2g4=xuH2u6nNcX0s9Jjg@mail.gmail.com
Lists: pgsql-hackers

On Sun, Mar 21, 2021 at 1:07 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Sat, Mar 20, 2021 at 10:09 AM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> >
> > On Sat, Mar 20, 2021 at 1:35 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >>
> >> On Fri, Mar 19, 2021 at 5:03 AM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> >> >
> >> > Missed the patch - 0001, resending.
> >> >
> >>
> >> I have made miscellaneous changes in the patch which includes
> >> improving comments, error messages, and miscellaneous coding
> >> improvements. The most notable one is that we don't need an additional
> >> parameter in walrcv_startstreaming, if the two_phase option is set
> >> properly. My changes are in v63-0002-Misc-changes-by-Amit, if you are
> >> fine with those, then please merge them in the next version. I have
> >> omitted the dev-logs patch but feel free to submit it. I have one
> >> question:
> >>
> >
> > I am fine with these changes. I see that Peter has already merged in these changes.
> >
>
> I have further updated the patch to implement unique GID on the
> subscriber-side as discussed in the nearby thread [1]. That requires
> some changes in the test. Additionally, I have updated some comments
> and docs. Let me know what do you think about the changes?
>

+static void
+TwoPhaseTransactionGid(RepOriginId originid, TransactionId xid,
+                       char *gid, int szgid)
+{
+    /* Origin and Transaction ids must be valid */
+    Assert(originid != InvalidRepOriginId);
+    Assert(TransactionIdIsValid(xid));
+
+    snprintf(gid, szgid, "pg_%u_%u", originid, xid);
+}

I found one issue in the current mechanism that we use to generate the
GIDs. In one scenario it generates the same GID for two different
prepared transactions. The steps to reproduce it are given below:
1. Set up two publishers and one subscriber, with
   synchronous_standby_names set.
2. Prepare txn 't1' on publisher1 (this txn is prepared as pg_1_542
   on the subscriber).
3. Drop the subscription to publisher1.
4. Create a subscription on the subscriber for publisher2 (the
   subscriber, which was earlier subscribing to publisher1, now
   subscribes to publisher2).
5. Prepare txn 't2' on publisher2 (this txn also uses pg_1_542 on the
   subscriber, even though the user has given a different gid).

The second prepared txn keeps waiting to complete on the subscriber,
but never does. The user gave the two prepared transactions different
gids, but both end up with the same gid on the subscriber. The
subscriber keeps failing with:
2021-03-22 10:14:57.859 IST [73959] ERROR: transaction identifier "pg_1_542" is already in use
2021-03-22 10:14:57.860 IST [73868] LOG: background worker "logical replication worker" (PID 73959) exited with exit code 1
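
To make the collision concrete, here is a minimal standalone sketch (not
the patch itself). It reuses only the "pg_%u_%u" format from
TwoPhaseTransactionGid() above. The typedefs, make_gid() and
gids_collide() are hypothetical stand-ins, and treating 0 as the invalid
origin id and xid is a simplification. My assumption is that the new
subscription gets the same origin id (1) and that both publishers
happened to be at the same xid (542), so the user-supplied gids 't1' and
't2' never enter into the generated name:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the PostgreSQL typedefs (assumed widths). */
typedef uint16_t RepOriginId;
typedef uint32_t TransactionId;

/*
 * Build a subscriber-side GID using the same format string as the
 * patch. Only the origin id and the publisher's xid are used; the
 * GID chosen by the user on the publisher plays no part.
 */
static void
make_gid(RepOriginId originid, TransactionId xid, char *gid, size_t szgid)
{
    /* Simplified validity checks: 0 is treated as invalid here. */
    assert(originid != 0);
    assert(xid != 0);

    snprintf(gid, szgid, "pg_%u_%u",
             (unsigned int) originid, (unsigned int) xid);
}

/* Return 1 if the two (origin, xid) pairs map to the same GID, else 0. */
static int
gids_collide(RepOriginId o1, TransactionId x1,
             RepOriginId o2, TransactionId x2)
{
    char a[64];
    char b[64];

    make_gid(o1, x1, a, sizeof(a));
    make_gid(o2, x2, b, sizeof(b));
    return strcmp(a, b) == 0;
}
```

With these stand-ins, gids_collide(1, 542, 1, 542) returns 1, which
corresponds to 't1' from publisher1 and 't2' from publisher2 both
becoming pg_1_542. Changing either the origin id or the xid makes it
return 0.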

The attached file has the steps to reproduce it.
This might be a rare scenario, and it may or may not be a realistic
user scenario. Should we handle it?

Regards,
Vignesh

Attachment Content-Type Size
possible_bug.sh application/x-shellscript 2.4 KB
