Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Oh, Mike" <minsoo(at)amazon(dot)com>
Subject: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2022-07-14 01:32:06
Message-ID: CAD21AoAG9v-YPaSxJjZK0uNOfY5onAXJbqkW=g33oGf7tJHZDg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 12, 2022 at 12:40 PM shiy(dot)fnst(at)fujitsu(dot)com
<shiy(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Tue, Jul 12, 2022 8:49 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > I've attached an updated patch.
> >
> > While trying this idea, I noticed there is no API to get the length of
> > dlist, as we discussed offlist. Alternative idea was to use List
> > (T_XidList) but I'm not sure it's a great idea since deleting an xid
> > from the list is O(N), we need to implement list_delete_xid, and we
> > need to make sure allocating list node in the reorder buffer context.
> > So in the patch, I added a variable, catchange_ntxns, to keep track of
> > the length of the dlist. Please review it.
> >
>
> Thanks for your patch. Here are some comments on the master patch.

Thank you for the comments.

>
> 1.
> In catalog_change_snapshot.spec, should we use "RUNNING_XACTS record" instead of
> "RUNNING_XACT record" / "XACT_RUNNING record" in the comment?
>
> 2.
> + * Since catchange.xip is sorted, we find the lower bound of
> + * xids that sill are interesting.
>
> Typo?
> "sill" -> "still"
>
> 3.
> + * This array is set once when restoring the snapshot, xids are removed
> + * from the array when decoding xl_running_xacts record, and then eventually
> + * becomes an empty.
>
> + /* catchange list becomes an empty */
> + pfree(builder->catchange.xip);
> + builder->catchange.xip = NULL;
>
> Should "becomes an empty" be modified to "becomes empty"?
>
> 4.
> + * changes that are smaller than ->xmin. Those won't ever get checked via
> + * the ->committed array and ->catchange, respectively. The committed xids will
>
> Should we change
> "the ->committed array and ->catchange"
> to
> "the ->committed or ->catchange array"
> ?

Agreed with all the above comments. These are incorporated in the
latest v4 patch I just sent[1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoAyNPrOFg%2BQGh%2B%3D4205TU0%3DyrE%2BQyMgzStkH85uBZXptQ%40mail.gmail.com

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message shiy.fnst@fujitsu.com 2022-07-14 02:16:00 RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Previous Message Masahiko Sawada 2022-07-14 01:30:31 Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns