Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Oh, Mike" <minsoo(at)amazon(dot)com>
Subject: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2021-10-11 06:27:41
Message-ID: CAD21AoD00wV4gt-53ze+ZB8n4bqJrdH8J_UnDHddy8S2A+a25g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

.

On Fri, Oct 8, 2021 at 4:50 PM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> At Thu, 7 Oct 2021 13:20:14 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in
> > Another idea to fix this problem would be that before calling
> > SnapBuildCommitTxn() we create transaction entries in ReorderBuffer
> > for (sub)transactions whose COMMIT record has XACT_XINFO_HAS_INVALS,
> > and then mark all of them as catalog-changed by calling
> > ReorderBufferXidSetCatalogChanges(). I've attached a PoC patch for
> > this idea. What the patch does is essentially the same as what the
> > proposed patch does. But the patch doesn't modify the
> > SnapBuildCommitTxn(). And we remember the list of last running
> > transactions in reorder buffer and the list is periodically purged
> > during decoding RUNNING_XACTS records, eventually making it empty.
>
> I came up with the third way. SnapBuildCommitTxn already properly
> handles the case where a ReorderBufferTXN with
> RBTXN_HAS_CATALOG_CHANGES. So this issue can be resolved by create
> such ReorderBufferTXNs in SnapBuildProcessRunningXacts.

Thank you for the idea and patch!

It's much simpler than mine. I think that creating an entry of a
catalog-changed transaction in the reorder buffer before
SunapBuildCommitTxn() is the right direction.

After more thought, given DDLs are not likely to happen than DML in
practice, probably we can always mark both the top transaction and its
subtransactions as containing catalog changes if the commit record has
XACT_XINFO_HAS_INVALS? I believe this is not likely to lead to
overhead in practice. That way, the patch could be more simple and
doesn't need the change of AssertTXNLsnOrder().

I've attached another PoC patch. Also, I've added the tests for this
issue in test_decoding.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Attachment Content-Type Size
poc_mark_all_txns_catalog_change.patch application/octet-stream 5.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2021-10-11 06:39:05 Re: Added schema level support for publication.
Previous Message Fujii Masao 2021-10-11 06:24:46 Re: Inconsistency in startup process's MyBackendId and procsignal array registration with ProcSignalInit()