RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
To: 'Masahiko Sawada' <sawada(dot)mshk(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Oh, Mike" <minsoo(at)amazon(dot)com>
Subject: RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2021-10-13 12:58:34
Message-ID: OSBPR01MB48886B287F714BF3B0F219A1EDB79@OSBPR01MB4888.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Monday, October 11, 2021 3:28 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Fri, Oct 8, 2021 at 4:50 PM Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
> wrote:
> >
> > At Thu, 7 Oct 2021 13:20:14 +0900, Masahiko Sawada
> > <sawada(dot)mshk(at)gmail(dot)com> wrote in
> > > Another idea to fix this problem would be that before calling
> > > SnapBuildCommitTxn() we create transaction entries in ReorderBuffer
> > > for (sub)transactions whose COMMIT record has
> XACT_XINFO_HAS_INVALS,
> > > and then mark all of them as catalog-changed by calling
> > > ReorderBufferXidSetCatalogChanges(). I've attached a PoC patch for
> > > this idea. What the patch does is essentially the same as what the
> > > proposed patch does. But the patch doesn't modify the
> > > SnapBuildCommitTxn(). And we remember the list of last running
> > > transactions in reorder buffer and the list is periodically purged
> > > during decoding RUNNING_XACTS records, eventually making it empty.
> >
> > I came up with the third way. SnapBuildCommitTxn already properly
> > handles the case where a ReorderBufferTXN with
> > RBTXN_HAS_CATALOG_CHANGES. So this issue can be resolved by
> create
> > such ReorderBufferTXNs in SnapBuildProcessRunningXacts.
>
> Thank you for the idea and patch!
>
> It's much simpler than mine. I think that creating an entry of a catalog-changed
> transaction in the reorder buffer before
> SunapBuildCommitTxn() is the right direction.
>
> After more thought, given DDLs are not likely to happen than DML in practice,
> probably we can always mark both the top transaction and its subtransactions
> as containing catalog changes if the commit record has
> XACT_XINFO_HAS_INVALS? I believe this is not likely to lead to overhead in
> practice. That way, the patch could be more simple and doesn't need the
> change of AssertTXNLsnOrder().
>
> I've attached another PoC patch. Also, I've added the tests for this issue in
> test_decoding.
I also felt that your patch addresses the problem in a good way.
Even without setting xid by NEW_CID decoding like in the original scenario,
we can set catalog change flag.

One really minor comment I have is,
in DecodeCommit(), you don't need to declar i. It's defined at the top of the function.

+ for (int i = 0; i < parsed->nsubxacts; i++)

Best Regards,
Takamichi Osumi

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amul Sul 2021-10-13 13:05:33 Re: when the startup process doesn't (logging startup delays)
Previous Message Andrew Dunstan 2021-10-13 12:55:38 Re: [RFC] building postgres with meson