Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Oh, Mike" <minsoo(at)amazon(dot)com>
Subject: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2022-05-30 05:42:56
Message-ID: CAD21AoC4x3uOw5rUcSYZkWob5s5ottGt_RPLxCEpHimFRDjrEg@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 25, 2022 at 12:11 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, May 24, 2022 at 2:18 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Tue, May 24, 2022 at 7:58 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Mon, May 23, 2022 at 2:39 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > On Mon, May 23, 2022 at 10:03 AM Kyotaro Horiguchi
> > > > <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > > > >
> > > > > At Sat, 21 May 2022 15:35:58 +0530, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote in
> > > > > > I think if we don't have any better ideas then we should go with
> > > > > > either this or one of the other proposals in this thread. The other
> > > > > > idea that occurred to me is whether we can somehow update the snapshot
> > > > > > we have serialized on disk about this information. On each
> > > > > > running_xact record when we serialize the snapshot, we also try to
> > > > > > purge the committed xacts (via SnapBuildPurgeCommittedTxn). So, during
> > > > > > that we can check if there are committed xacts to be purged and if we
> > > > > > have previously serialized the snapshot for the prior running xact
> > > > > > record, if so, we can update it with the list of xacts that have
> > > > > > catalog changes. If this is feasible then I think we need to somehow
> > > > > > remember the point where we last serialized the snapshot (maybe by
> > > > > > > using builder->last_serialized_snapshot). Even if this is feasible, we
> > > > > > may not be able to do this in back-branches because of the disk-format
> > > > > > change required for this.
> > > > > >
> > > > > > Thoughts?
> > >
> > > It seems to work, could you draft the patch?
> > >
> >
> > I can help with the review and discussion.
>
> Okay, I'll draft the patch for this idea.

I've attached three POC patches:

poc_remember_last_running_xacts_v2.patch is a rebased version of my
previous proposal[1]. It follows the original idea: we remember the
xids listed in the first decoded RUNNING_XACTS record and, when
decoding a commit record, check whether the record has
XACT_XINFO_HAS_INVALS set and whether its xid is in that list. If both
hold, the transaction is added to the snapshot. This doesn't require
any file format changes, but a transaction will end up being added to
the snapshot even if it has only relcache invalidations.
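
To make the check in the first patch concrete, here is a rough
standalone sketch of the idea. It is not code from the patch:
SketchBuilder, the sketch_* functions, and SKETCH_XINFO_HAS_INVALS are
hypothetical stand-ins for the snapshot builder and the
XACT_XINFO_HAS_INVALS flag, and xids are compared as plain integers,
ignoring wraparound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t TransactionId;

/* Hypothetical stand-in for the commit record's "has invalidations" flag. */
#define SKETCH_XINFO_HAS_INVALS (1U << 3)

/* Hypothetical builder state: xids from the first RUNNING_XACTS record. */
typedef struct SketchBuilder
{
    bool           remembered;     /* has a RUNNING_XACTS record been seen? */
    TransactionId *last_running;   /* sorted copy of its xid list */
    size_t         n_last_running;
} SketchBuilder;

static int
sketch_xid_cmp(const void *a, const void *b)
{
    TransactionId x = *(const TransactionId *) a;
    TransactionId y = *(const TransactionId *) b;

    return (x > y) - (x < y);
}

/* Remember the running xids, but only for the first record decoded. */
void
sketch_remember_running(SketchBuilder *b, const TransactionId *xids, size_t n)
{
    if (b->remembered)
        return;
    b->remembered = true;
    if (n == 0)
        return;

    b->last_running = malloc(n * sizeof(TransactionId));
    assert(b->last_running != NULL);
    memcpy(b->last_running, xids, n * sizeof(TransactionId));
    qsort(b->last_running, n, sizeof(TransactionId), sketch_xid_cmp);
    b->n_last_running = n;
}

/*
 * At commit: treat the transaction as catalog-changing if its commit
 * record carries invalidations and its xid was in the remembered list.
 * Note this is also true for relcache-only invalidations.
 */
bool
sketch_commit_changed_catalog(const SketchBuilder *b, TransactionId xid,
                              uint32_t xinfo)
{
    if ((xinfo & SKETCH_XINFO_HAS_INVALS) == 0 || b->n_last_running == 0)
        return false;

    return bsearch(&xid, b->last_running, b->n_last_running,
                   sizeof(TransactionId), sketch_xid_cmp) != NULL;
}
```

The over-inclusion mentioned above shows up in the last function: the
flag alone cannot distinguish catalog invalidations from relcache-only
ones.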

poc_add_running_catchanges_xacts_to_serialized_snapshot.patch
implements the idea Amit Kapila proposed, with some changes. The basic
approach is to remember the xids of transactions that had changed
catalogs and were still running when the snapshot was serialized. The
list of xids is kept in SnapShotBuilder and is written to and restored
from the serialized snapshot. When decoding a commit record, we check
whether the transaction is already marked as having catalog changes or
its xid is in the list. If so, we add it to the snapshot. Unlike the
first patch, this adds only transactions that have actually changed
catalogs, but as Amit mentioned before, it cannot be back-patched
because it changes the on-disk format of the serialized snapshot.
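
As an illustration of the second approach, the following standalone
sketch stores a list of catalog-changing xids alongside a snapshot,
round-trips it through a byte buffer, and consults it at commit time.
The names (CatchangeSnapshot, catchange_*) and the "count followed by
xids" layout are assumptions made for this example, not the format
used by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t TransactionId;

#define CATCHANGE_MAX_XIDS 64

/* Hypothetical extra state kept with the snapshot being serialized. */
typedef struct CatchangeSnapshot
{
    uint32_t      nxids;                     /* number of valid entries */
    TransactionId xids[CATCHANGE_MAX_XIDS];  /* running, catalog-changing */
} CatchangeSnapshot;

/* Write "count, then xids" into buf; returns bytes written, 0 on failure. */
size_t
catchange_serialize(const CatchangeSnapshot *snap, unsigned char *buf,
                    size_t buflen)
{
    size_t need;

    assert(snap != NULL && buf != NULL);
    if (snap->nxids > CATCHANGE_MAX_XIDS)
        return 0;
    need = sizeof(snap->nxids) + snap->nxids * sizeof(TransactionId);
    if (need > buflen)
        return 0;

    memcpy(buf, &snap->nxids, sizeof(snap->nxids));
    memcpy(buf + sizeof(snap->nxids), snap->xids,
           snap->nxids * sizeof(TransactionId));
    return need;
}

/* Restore the list from buf; returns false if the data is malformed. */
bool
catchange_restore(CatchangeSnapshot *snap, const unsigned char *buf,
                  size_t buflen)
{
    uint32_t n;

    assert(snap != NULL && buf != NULL);
    if (buflen < sizeof(n))
        return false;
    memcpy(&n, buf, sizeof(n));
    if (n > CATCHANGE_MAX_XIDS ||
        buflen < sizeof(n) + n * sizeof(TransactionId))
        return false;

    snap->nxids = n;
    memcpy(snap->xids, buf + sizeof(n), n * sizeof(TransactionId));
    return true;
}

/*
 * At commit: add the transaction to the snapshot if it is already
 * marked as catalog-changing, or if its xid is in the restored list.
 */
bool
catchange_should_add_to_snapshot(const CatchangeSnapshot *snap,
                                 TransactionId xid,
                                 bool marked_catalog_change)
{
    assert(snap != NULL);
    if (marked_catalog_change)
        return true;
    for (uint32_t i = 0; i < snap->nxids; i++)
    {
        if (snap->xids[i] == xid)
            return true;
    }
    return false;
}
```

The extra bytes written by the serialize step are the reason this
approach changes the on-disk format and so cannot be back-patched.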

poc_add_regression_tests.patch adds regression tests for this bug. The
regression tests are needed for both HEAD and the back branches, but
I've kept them in a separate patch so that either of the two patches
above can be tested easily.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Attachment                                                      Content-Type         Size
poc_add_running_catchanges_xacts_to_serialized_snapshot.patch   application/x-patch  7.8 KB
poc_add_regression_tests.patch                                  application/x-patch  2.4 KB
poc_remember_last_running_xacts_v2.patch                        application/x-patch  6.6 KB
