Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Oh, Mike" <minsoo(at)amazon(dot)com>
Subject: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2022-07-07 06:40:06
Message-ID: CAA4eK1KMegGo+QwMxJcHMbYN-3QOLH6hD2thWXA2_M3MLH01Ug@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 7, 2022 at 8:21 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Wed, Jul 6, 2022 at 5:55 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Wed, Jul 6, 2022 at 12:19 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Jul 5, 2022 at 8:00 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > 2. Are we anytime removing transaction ids from catchanges->xip array?
> > >
> > > No.
> > >
> > > > If not, is there a reason for the same? I think we can remove it
> > > > either at commit/abort or even immediately after adding the xid/subxid
> > > > to committed->xip array.
> > >
> > > It might be a good idea but I'm concerned that removing XID from the
> > > array at every commit/abort or after adding it to committed->xip array
> > > might be costly as it requires adjustment of the array to keep its
> > > order. Removing XIDs from the array would make bsearch faster but the
> > > array is updated reasonably often (every 15 sec).
> > >
> >
> > Fair point. However, I am slightly worried that we are unnecessarily
> > searching in this new array even when ReorderBufferTxn has the
> > required information. To avoid that, in function
> > SnapBuildXidHasCatalogChange(), we can first check
> > ReorderBufferXidHasCatalogChanges() and then check the array if the
> > first check doesn't return true. Also, by the way, do we need to
> > always keep builder->catchanges.xip updated via SnapBuildRestore()?
> > Isn't it sufficient that we just read and throw away contents from a
> > snapshot if builder->catchanges.xip is non-NULL?
>
> IIUC catchanges.xip is restored only once when restoring a consistent
> snapshot via SnapBuildRestore(). I think it's necessary to set
> catchanges.xip for later use in SnapBuildXidHasCatalogChange(). Or did
> you mean via SnapBuildSerialize()?
>

Sorry, I got confused about the way restore is used. You are right, it
will be done once. My main worry is that we shouldn't look at
builder->catchanges.xip array on an ongoing basis which I think can be
dealt with by one of the ideas you mentioned below. But, I think we
can still follow the other suggestion of moving the
ReorderBufferXidHasCatalogChanges() check prior to checking the array.

> >
> > I had additionally thought about whether we can further optimize this
> > solution to store this additional information only when we need to
> > serialize for a checkpoint record, but I think that won't work because
> > the walsender can restart even without a restart of the server, in
> > which case the same problem can occur.
>
> Yes, probably we need to write catalog modifying transactions for
> every serialized snapshot.
>
> > I am not sure if there is a way to further optimize this
> > solution, let me know if you have any ideas?
>
> I suppose that writing additional information to serialized snapshots
> would not be a noticeable overhead since we need 4 bytes per
> transaction and we would not expect there is a huge number of
> concurrent catalog modifying transactions. But both collecting catalog
> modifying transactions (especially when there are many ongoing
> transactions) and bsearch'ing on the XID list every time we decode a
> COMMIT record could bring overhead.
>
> A solution for the first point would be to keep track of catalog
> modifying transactions by using a linked list so that we can avoid
> checking all ongoing transactions.
>

This sounds reasonable to me.

> Regarding the second point, on reflection, I think we need to look up
> the XID list until all XIDs in the list are committed/aborted. We can
> remove XIDs from the list after adding them to committed.xip as you
> suggested. Or when decoding a RUNNING_XACTS record, we can remove XIDs
> older than builder->xmin from the list like we do for committed.xip in
> SnapBuildPurgeCommittedTxn().
>

I think doing it along with RUNNING_XACTS should be fine. Doing it at
each commit/abort could be costly because we need to maintain the
sort order. In general, I feel any one of these should be okay because
once the array becomes empty, it won't be used again and there won't
be any operation related to it during ongoing replication.

--
With Regards,
Amit Kapila.
