Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Oh, Mike" <minsoo(at)amazon(dot)com>
Subject: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2022-07-06 02:07:32
Message-ID: CAD21AoBeMVfY6dvg5J8XmSHGbc3+yTA8k_GVRzwZR24zc4TfSw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 4, 2022 at 9:42 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, May 30, 2022 at 11:13 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > I've attached three POC patches:
> >
>
> I think it will be a good idea if you can add a short commit message
> at least to say which patch is proposed for HEAD and which one is for
> back branches. Also, it would be good if you can add some description
> of the fix in the commit message. Let's remove poc* from the patch
> name.

Updated.

>
> Review poc_add_running_catchanges_xacts_to_serialized_snapshot
> =====================================================
> 1.
> + /*
> + * Array of transactions that were running when the snapshot serialization
> + * and changed system catalogs,
>
> The part of the sentence after serialization is not very clear.

Updated.

>
> 2.
> - if (ReorderBufferXidHasCatalogChanges(builder->reorder, subxid))
> + if (ReorderBufferXidHasCatalogChanges(builder->reorder, subxid) ||
> + bsearch(&xid, builder->catchanges.xip, builder->catchanges.xcnt,
> + sizeof(TransactionId), xidComparator) != NULL)
>
> Why are you using xid instead of subxid in bsearch call? Can we add a
> comment to say why it is okay to use xid if there is a valid reason?
> But note, we are using subxid to add to the committed xact array so
> not sure if this is a good idea but I might be missing something.

You're right, subxid should be used here.

>
> Suggestions for improvement in comments:
> - /*
> - * Update the transactions that are running and changes
> catalogs that are
> - * not committed.
> - */
> + /* Update the catalog modifying transactions that are yet not
> committed. */
> if (builder->catchanges.xip)
> pfree(builder->catchanges.xip);
> builder->catchanges.xip =
> ReorderBufferGetCatalogChangesXacts(builder->reorder,
> @@ -1647,7 +1644,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
> COMP_CRC32C(ondisk->checksum, ondisk_c, sz);
> ondisk_c += sz;
>
> - /* copy catalog-changes xacts */
> + /* copy catalog modifying xacts */
> sz = sizeof(TransactionId) * builder->catchanges.xcnt;
> memcpy(ondisk_c, builder->catchanges.xip, sz);
> COMP_CRC32C(ondisk->checksum, ondisk_c, sz);

Updated.

I'll post a new version patch in the next email with replying to other comments.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2022-07-06 02:21:50 Re: AIX support - alignment issues
Previous Message Michael Paquier 2022-07-06 01:50:10 Re: pg_upgrade (12->14) fails on aggregate