Re: test_decoding assertion failure for the loss of top-sub transaction relationship

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: dilipbalaut(at)gmail(dot)com, kuroda(dot)hayato(at)fujitsu(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, osumi(dot)takamichi(at)fujitsu(dot)com
Subject: Re: test_decoding assertion failure for the loss of top-sub transaction relationship
Date: 2022-09-02 10:27:03
Message-ID: CAA4eK1+A37ocd+2G__hiPdeKxyR1odwcBT7sk1CaRAZBVCJdEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 2, 2022 at 12:25 PM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> At Fri, 2 Sep 2022 11:27:23 +0530, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote in
> > On Fri, Sep 2, 2022 at 11:25 AM kuroda(dot)hayato(at)fujitsu(dot)com
> > <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> > > How about following?
> > >
> > > diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
> > > index bf72ad45ec..a630522907 100644
> > > --- a/src/backend/replication/logical/snapbuild.c
> > > +++ b/src/backend/replication/logical/snapbuild.c
> > > @@ -1086,8 +1086,17 @@ SnapBuildCommitTxn(SnapBuild *builder, XLogRecPtr lsn, TransactionId xid,
> > > }
> > > }
> > >
> > > - /* if top-level modified catalog, it'll need a snapshot */
> > > - if (SnapBuildXidHasCatalogChanges(builder, xid, xinfo))
> > > + /*
> > > + * if top-level or one of sub modified catalog, it'll need a snapshot.
> > > + *
> > > + * Normally the second check is not needed because the relation between
> > > + * top-sub transactions is tracked on the ReorderBuffer layer, and the top
> > > + * transaction is marked as containing catalog changes if its children are.
> > > + * But in some cases the relation may be missed, in which case only the sub
> > > + * transaction may be marked as containing catalog changes.
> > > + */
> > > + if (SnapBuildXidHasCatalogChanges(builder, xid, xinfo)
> > > + || sub_needs_timetravel)
> > > {
> > > elog(DEBUG2, "found top level transaction %u, with catalog changes",
> > > xid);
> > > @@ -1095,11 +1104,6 @@ SnapBuildCommitTxn(SnapBuild *builder, XLogRecPtr lsn, TransactionId xid,
> > > needs_timetravel = true;
> > > SnapBuildAddCommittedTxn(builder, xid);
> > > }
> > > - else if (sub_needs_timetravel)
> > > - {
> > > - /* track toplevel txn as well, subxact alone isn't meaningful */
> > > - SnapBuildAddCommittedTxn(builder, xid);
> > > - }
> > > else if (needs_timetravel)
> > > {
> > > elog(DEBUG2, "forced transaction %u to do timetravel", xid);
> >
> > Yeah, I am fine with this as well.
>
> I'm basically fine, too. But this is a bug that needs back-patching
> back to 10.
>

I have not verified but I think we need to backpatch this till 14
because prior to that in DecodeCommit, we use to set the top-level txn
as having catalog changes based on the if there are invalidation
messages in the commit record. So, in the current scenario shared by
Osumi-San, before SnapBuildCommitTxn(), the top-level txn will be
marked as having catalog changes.

> This change changes the condition for the DEBUG2 message.
> So we need to add an awkward if() condition for the DEBUG2 message.
> Looking that the messages have different debug-level, I doubt there
> have been a chance they are useful. If we remove the two DEBUGx
> messages, I'm fine with the change.
>

I think these DEBUG2 messages could be useful, so instead of removing
these, I suggest we should follow Dilip's proposed fix and maybe add a
new DEBUG2 message on the lines of (("forced transaction %u to do
timetravel due to one of its subtransaction", xid) in the else if
(sub_needs_timetravel) condition if we think that will be useful too
but I am fine leaving the addition of new DEBUG2 message.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2022-09-02 10:29:01 Re: Handle infinite recursion in logical replication setup
Previous Message David Rowley 2022-09-02 10:06:48 Re: Clarify restriction on partitioned tables primary key / unique indexes