Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Oh, Mike" <minsoo(at)amazon(dot)com>
Subject: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2022-07-23 11:32:02
Message-ID: CAA4eK1+U1b7P2mMTZHGE0GaT4vYhGsShQx_6070a6xgwj9JQAQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 22, 2022 at 11:48 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Wed, Jul 20, 2022 at 5:50 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Wed, Jul 20, 2022 at 1:28 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> >
> > This is required if we don't want to introduce a new set of functions
> > as you proposed above. I am not sure which one is better w.r.t back
> > patching effort later but it seems to me using flag stuff would make
> > future back patches easier if we make any changes in
> > SnapBuildCommitTxn.
>
> Understood.
>
> I've implemented this idea as well for discussion. Both patches have
> the common change to remember the initial running transactions and to
> purge them when decoding xl_running_xacts records. The difference is
> how to mark the transactions as needing to be added to the snapshot.
>
> In v7-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch,
> when the transaction is in the initial running xact list and its
> commit record has XINFO_HAS_INVAL flag, we mark both the top
> transaction and its all subtransactions as containing catalog changes
> (which also means to create ReorderBufferTXN entries for them). These
> transactions are added to the snapshot in SnapBuildCommitTxn() since
> ReorderBufferXidHasCatalogChanges () for them returns true.
>
> In poc_mark_top_txn_has_inval.patch, when the transaction is in the
> initial running xacts list and its commit record has XINFO_HAS_INVALS
> flag, we set a new flag, say RBTXN_COMMIT_HAS_INVALS, only to the top
> transaction.
>

It seems that the patch has missed the part to check if the xid is in
the initial running xacts list?

> In SnapBuildCommitTxn(), we add all subtransactions to
> the snapshot without checking ReorderBufferXidHasCatalogChanges() for
> subtransactions if its top transaction has the RBTXN_COMMIT_HAS_INVALS
> flag.
>
> A difference between the two ideas is the scope of changes: the former
> changes only snapbuild.c but the latter changes both snapbuild.c and
> reorderbuffer.c. Moreover, while the former uses the existing flag,
> the latter adds a new flag to the reorder buffer for dealing with only
> this case. I think the former idea is simpler in terms of that. But,
> an advantage of the latter idea is that the latter idea can save to
> create ReorderBufferTXN entries for subtransactions.
>
> Overall I prefer the former for now but I'd like to hear what others think.
>

I agree that the latter idea can have better performance in extremely
special scenarios but introducing a new flag for the same sounds a bit
ugly to me. So, I would also prefer to go with the former idea,
however, I would also like to hear what Horiguchi-San and others have
to say.

Few comments on v7-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during:
1.
+void
+SnapBuildInitialXactSetCatalogChanges(SnapBuild *builder, TransactionId xid,
+ int subxcnt, TransactionId *subxacts,
+ XLogRecPtr lsn)
+{

I think it is better to name this function as
SnapBuildXIDSetCatalogChanges as we use this to mark a particular
transaction as having catalog changes.

2. Changed/added a few comments in the attached.

--
With Regards,
Amit Kapila.

Attachment Content-Type Size
v7-0001-diff-amit.patch application/octet-stream 2.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Martin Kalcher 2022-07-23 12:20:47 Re: [PATCH] Introduce array_shuffle() and array_sample()
Previous Message Aleksander Alekseev 2022-07-23 10:58:31 Re: Refactoring the regression tests for more independence