Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Oh, Mike" <minsoo(at)amazon(dot)com>
Subject: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2022-07-08 01:14:25
Message-ID: CAD21AoCF=8SKOJYWF12fPEwXc5GLU2SvhfZ3QoR93Qqn+6oeSg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 7, 2022 at 3:40 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Jul 7, 2022 at 8:21 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Wed, Jul 6, 2022 at 5:55 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Jul 6, 2022 at 12:19 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > > > On Tue, Jul 5, 2022 at 8:00 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > > 2. Are we anytime removing transaction ids from catchanges->xip array?
> > > >
> > > > No.
> > > >
> > > > > If not, is there a reason for the same? I think we can remove it
> > > > > either at commit/abort or even immediately after adding the xid/subxid
> > > > > to committed->xip array.
> > > >
> > > > It might be a good idea but I'm concerned that removing XID from the
> > > > array at every commit/abort or after adding it to committed->xip array
> > > > might be costly as it requires adjustment of the array to keep its
> > > > order. Removing XIDs from the array would make bsearch faster but the
> > > > array is updated reasonably often (every 15 sec).
> > > >
> > >
> > > Fair point. However, I am slightly worried that we are unnecessarily
> > > searching in this new array even when ReorderBufferTxn has the
> > > required information. To avoid that, in function
> > > SnapBuildXidHasCatalogChange(), we can first check
> > > ReorderBufferXidHasCatalogChanges() and then check the array if the
> > > first check doesn't return true. Also, by the way, do we need to
> > > always keep builder->catchanges.xip updated via SnapBuildRestore()?
> > > Isn't it sufficient that we just read and throw away contents from a
> > > snapshot if builder->catchanges.xip is non-NULL?
> >
> > IIUC catchanges.xip is restored only once when restoring a consistent
> > snapshot via SnapBuildRestore(). I think it's necessary to set
> > catchanges.xip for later use in SnapBuildXidHasCatalogChange(). Or did
> > you mean via SnapBuildSerialize()?
> >
>
> Sorry, I got confused about the way restore is used. You are right, it
> will be done once. My main worry is that we shouldn't look at the
> builder->catchanges.xip array on an ongoing basis, which I think can
> be dealt with by one of the ideas you mentioned below. But I think we
> can still follow the other suggestion of moving the
> ReorderBufferXidHasCatalogChanges() check prior to checking the array.

Agreed. I've incorporated this change in the new version patch.
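For illustration, the reordered check could look like the standalone sketch below. The struct layout, the stubbed-out ReorderBuffer check, and the field names are simplifications, not the actual patch:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

/* Simplified stand-in for the snapshot builder state. */
typedef struct SnapBuild
{
    size_t         xcnt;  /* number of entries in xip */
    TransactionId *xip;   /* sorted catalog-modifying XIDs from the snapshot */
} SnapBuild;

/* Stub: the real function consults the ReorderBufferTxn entry. */
static bool
ReorderBufferXidHasCatalogChanges(TransactionId xid)
{
    (void) xid;
    return false;
}

static int
xidComparator(const void *a, const void *b)
{
    TransactionId xa = *(const TransactionId *) a;
    TransactionId xb = *(const TransactionId *) b;

    if (xa < xb)
        return -1;
    if (xa > xb)
        return 1;
    return 0;
}

/*
 * Check the reorder buffer first; only fall back to bsearch on the
 * array restored from the serialized snapshot when that says no.
 */
static bool
SnapBuildXidHasCatalogChange(SnapBuild *builder, TransactionId xid)
{
    if (ReorderBufferXidHasCatalogChanges(xid))
        return true;

    if (builder->xip == NULL || builder->xcnt == 0)
        return false;

    return bsearch(&xid, builder->xip, builder->xcnt,
                   sizeof(TransactionId), xidComparator) != NULL;
}
```
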

>
> > >
> > > I had additionally thought about whether we can further optimize this
> > > solution to store this additional information only when we serialize
> > > for a checkpoint record, but I think that won't work because the
> > > walsender can restart even without a restart of the server, in which
> > > case the same problem can occur.
> >
> > Yes, probably we need to write catalog modifying transactions for
> > every serialized snapshot.
> >
> > > I am not sure if there is a way to further optimize this
> > > solution; let me know if you have any ideas.
> >
> > I suppose that writing additional information to serialized snapshots
> > would not be a noticeable overhead since we need 4 bytes per
> > transaction and we would not expect there is a huge number of
> > concurrent catalog modifying transactions. But both collecting catalog
> > modifying transactions (especially when there are many ongoing
> > transactions) and bsearch'ing on the XID list every time decoding the
> > COMMIT record could bring overhead.
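For a sense of the on-disk cost: the addition amounts to a 4-byte count followed by 4 bytes per XID, roughly as in the sketch below. The helper names are made up for illustration; the real patch would fold this into SnapBuildSerialize()/SnapBuildRestore():

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t TransactionId;

/* Write a 4-byte count followed by 4 bytes per XID; return bytes written. */
static size_t
serialize_catchange_xids(const TransactionId *xip, uint32_t xcnt, char *buf)
{
    memcpy(buf, &xcnt, sizeof(uint32_t));
    memcpy(buf + sizeof(uint32_t), xip, (size_t) xcnt * sizeof(TransactionId));
    return sizeof(uint32_t) + (size_t) xcnt * sizeof(TransactionId);
}

/* Read the count and XIDs back; the caller frees *xip_out. */
static uint32_t
restore_catchange_xids(const char *buf, TransactionId **xip_out)
{
    uint32_t xcnt;

    memcpy(&xcnt, buf, sizeof(uint32_t));
    *xip_out = malloc((size_t) xcnt * sizeof(TransactionId));
    memcpy(*xip_out, buf + sizeof(uint32_t),
           (size_t) xcnt * sizeof(TransactionId));
    return xcnt;
}
```
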
> >
> > A solution for the first point would be to keep track of catalog
> > modifying transactions by using a linked list so that we can avoid
> > checking all ongoing transactions.
> >
>
> This sounds reasonable to me.
>
> > Regarding the second point, on reflection, I think we need to look up
> > the XID list until all XIDs in the list are committed/aborted. We can
> > remove XIDs from the list after adding them to committed.xip as you
> > suggested. Or when decoding a RUNNING_XACTS record, we can remove XIDs
> > older than builder->xmin from the list like we do for committed.xip in
> > SnapBuildPurgeCommittedTxn().
> >
>
> I think doing it along with RUNNING_XACTS should be fine. At each
> commit/abort, the cost could be high because we need to maintain the
> sort order. In general, I feel any one of these should be okay because
> once the array becomes empty, it won't be used again and there won't
> be any operation related to it during ongoing replication.
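A purge along those lines could look like the sketch below. It uses plain ascending comparison for simplicity (real code would have to respect XID wraparound semantics), and the function name is invented:

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t TransactionId;

typedef struct SnapBuild
{
    size_t         xcnt;  /* number of entries in xip */
    TransactionId *xip;   /* sorted catalog-modifying XIDs */
} SnapBuild;

/*
 * Drop XIDs below xmin when decoding a RUNNING_XACTS record, the same
 * way SnapBuildPurgeCommittedTxn() trims committed.xip.
 */
static void
SnapBuildPurgeCatchangeXids(SnapBuild *builder, TransactionId xmin)
{
    size_t i = 0;

    /* The array is sorted, so everything to purge is a prefix. */
    while (i < builder->xcnt && builder->xip[i] < xmin)
        i++;

    if (i > 0)
    {
        memmove(builder->xip, builder->xip + i,
                (builder->xcnt - i) * sizeof(TransactionId));
        builder->xcnt -= i;
    }
}
```
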

I've attached the new version patch that incorporates the comments and
the optimizations discussed above.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Attachment Content-Type Size
v2-0001-Add-catalog-modifying-transactions-to-logical-dec.patch application/octet-stream 23.9 KB
