Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: sawada(dot)mshk(at)gmail(dot)com
Cc: bdrouvot(at)amazon(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, minsoo(at)amazon(dot)com
Subject: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2021-10-14 02:21:28
Message-ID: 20211014.112128.1398650206608979125.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Mon, 11 Oct 2021 15:27:41 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in
> .
>
> On Fri, Oct 8, 2021 at 4:50 PM Kyotaro Horiguchi
> <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > I came up with the third way. SnapBuildCommitTxn already properly
> > handles the case where a ReorderBufferTXN with
> > RBTXN_HAS_CATALOG_CHANGES. So this issue can be resolved by create
> > such ReorderBufferTXNs in SnapBuildProcessRunningXacts.
>
> Thank you for the idea and patch!
>
> It's much simpler than mine. I think that creating an entry of a
> catalog-changed transaction in the reorder buffer before
> SunapBuildCommitTxn() is the right direction.
>
> After more thought, given DDLs are not likely to happen than DML in
> practice, probably we can always mark both the top transaction and its
> subtransactions as containing catalog changes if the commit record has
> XACT_XINFO_HAS_INVALS? I believe this is not likely to lead to
> overhead in practice. That way, the patch could be more simple and
> doesn't need the change of AssertTXNLsnOrder().
>
> I've attached another PoC patch. Also, I've added the tests for this
> issue in test_decoding.

Thanks for the test script. (I did that with TAP framework but
isolation tester version is simpler.)

It adds a call to ReorderBufferAssignChild but usually subtransactions
are assigned to top level elsewherae. Addition to that
ReorderBufferCommitChild() called just later does the same thing. We
are adding the third call to the same function, which looks a bit odd.

And I'm not sure it is wise to mark all subtransactions as "catalog
changed" always when the top transaction is XACT_XINFO_HAS_INVALS. The
reason I did that in the snapshiot building phase is to prevent adding
to DecodeCommit an extra code that is needed only while any
transaction running since before replication start is surviving.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-10-14 02:55:51 Re: Inconsistent behavior of pg_dump/pg_restore on DEFAULT PRIVILEGES
Previous Message Michael Paquier 2021-10-14 02:07:21 Re: BUG #17220: ALTER INDEX ALTER COLUMN SET (..) with an optionless opclass makes index and table unusable