Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Oh, Mike" <minsoo(at)amazon(dot)com>
Subject: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2022-05-21 10:05:58
Message-ID: CAA4eK1L_e4r4=10TTzoOiWGQ6hH4vxWAwD=0aofz04czCTV0kQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 11, 2021 at 11:58 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> It's much simpler than mine. I think that creating an entry of a
> catalog-changed transaction in the reorder buffer before
> SnapBuildCommitTxn() is the right direction.
>
> After more thought, given DDLs are less likely to happen than DML in
> practice, probably we can always mark both the top transaction and its
> subtransactions as containing catalog changes if the commit record has
> XACT_XINFO_HAS_INVALS?
>

I have some observations and thoughts on this work.

1.
+# For the transaction that TRUNCATEd the table tbl1, the last decoding decodes
+# only its COMMIT record, because it starts from the RUNNING_XACT record
+# emitted during the second checkpoint execution. This transaction must be
+# marked as containing catalog changes during decoding the COMMIT record and
+# the decoding of the INSERT record must read the pg_class with the correct
+# historic snapshot.
+permutation "s0_init" "s0_begin" "s0_savepoint" "s0_truncate" "s1_checkpoint" "s1_get_changes" "s0_commit" "s0_begin" "s0_insert" "s1_checkpoint" "s1_get_changes" "s0_commit" "s1_get_changes"

In the first sentence of the comment, do you want to say "... record emitted
during the first checkpoint"? Only then can the decoding start from the
commit record of the transaction that has performed the truncate.

2.
+ /*
+  * Mark the top transaction and its subtransactions as containing catalog
+  * changes, if the commit record has invalidation messages. This is necessary
+  * for the case where we decode only the commit record of the transaction
+  * that actually has done catalog changes.
+  */
+ if (parsed->xinfo & XACT_XINFO_HAS_INVALS)
+ {
+     ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr);
+
+     for (int i = 0; i < parsed->nsubxacts; i++)
+     {
+         ReorderBufferAssignChild(ctx->reorder, xid, parsed->subxacts[i],
+                                  buf->origptr);
+         ReorderBufferXidSetCatalogChanges(ctx->reorder, parsed->subxacts[i],
+                                           buf->origptr);
+     }
+ }
+
  SnapBuildCommitTxn(ctx->snapshot_builder, buf->origptr, xid,
                     parsed->nsubxacts, parsed->subxacts);

Marking it before SnapBuildCommitTxn has one disadvantage: we sometimes do
this work even when the snapshot state is SNAPBUILD_START or
SNAPBUILD_BUILDING_SNAPSHOT, in which case SnapBuildCommitTxn wouldn't do
anything. So, while this will fix the issue, it seems we end up doing this
work even when we have already marked the txn as containing catalog changes,
and there are probably cases where we mark transactions when it is not
required, as discussed in this thread.

I think if we don't have any better ideas then we should go with either this
or one of the other proposals in this thread. The other idea that occurred to
me is whether we can somehow update the snapshot we have serialized on disk
with this information. On each running_xacts record, when we serialize the
snapshot, we also try to purge the committed xacts (via
SnapBuildPurgeCommittedTxn). During that, we can check whether there are
committed xacts to be purged and whether we have previously serialized a
snapshot for the prior running_xacts record; if so, we can update that
snapshot with the list of xacts that have catalog changes. If this is
feasible, then I think we need to somehow remember the point where we last
serialized the snapshot (maybe by using builder->last_serialized_snapshot).
Even if this is feasible, we may not be able to do this in back-branches
because of the disk-format change it requires.

Thoughts?

--
With Regards,
Amit Kapila.
