Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, <osumi(dot)takamichi(at)fujitsu(dot)com>, <sawada(dot)mshk(at)gmail(dot)com>
Cc: <pgsql-hackers(at)lists(dot)postgresql(dot)org>, <minsoo(at)amazon(dot)com>
Subject: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2022-02-21 10:29:51
Message-ID: 25c03b0c-1684-fc66-6aef-0ab41d824510@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 10/19/21 8:43 AM, Kyotaro Horiguchi wrote:
> At Tue, 19 Oct 2021 02:45:24 +0000, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com> wrote in
>> On Thursday, October 14, 2021 11:21 AM Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
>>> It adds a call to ReorderBufferAssignChild but usually subtransactions are
>>> assigned to top level elsewherae. Addition to that
>>> ReorderBufferCommitChild() called just later does the same thing. We are
>>> adding the third call to the same function, which looks a bit odd.
>> It can be odd. However, we
>> have a check at the top of ReorderBufferAssignChild
>> to judge if the sub transaction is already associated or not
>> and skip the processings if it is.
> My question was why do we need to make the extra call to
> ReorerBufferCommitChild when XACT_XINFO_HAS_INVALS in spite of the
> existing call to the same fuction that unconditionally made. It
> doesn't cost so much but also it's not free.
>
>>> And I'm not sure it is wise to mark all subtransactions as "catalog changed"
>>> always when the top transaction is XACT_XINFO_HAS_INVALS.
>> In order to avoid this,
>> can't we have a new flag (for example, in reorderbuffer struct) to check
>> if we start decoding from RUNNING_XACTS, which is similar to the first patch of [1]
>> and use it at DecodeCommit ? This still leads to some extra specific codes added
>> to DecodeCommit and this solution becomes a bit similar to other previous patches though.
> If it is somehow wrong in any sense that we add subtransactions in
> SnapBuildProcessRunningXacts (for example, we should avoid relaxing
> the assertion condition.), I think we would go another way. Otherwise
> we don't even need that additional flag. (But Sawadasan's recent PoC
> also needs that relaxation.)
>
> ASAICS, and unless I'm missing something (that odds are rtlatively
> high:p), we need the specially added subransactions only for the
> transactions that were running at passing the first RUNNING_XACTS,
> becuase otherwise (substantial) subtransactions are assigned to
> toplevel by the first record of the subtransaction.
>
> Before reaching consistency, DecodeCommit feeds the subtransactions to
> ReorderBufferForget individually so the subtransactions are not needed
> to be assigned to the top transaction at all. Since the
> subtransactions added by the first RUNNING_XACT are processed that
> way, we don't need in the first place to call ReorderBufferCommitChild
> for such subtransactions.
>
>> [1] - https://www.postgresql.org/message-id/81D0D8B0-E7C4-4999-B616-1E5004DBDCD2%40amazon.com
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>
Just rebased (minor change in the contrib/test_decoding/Makefile) the
last POC version linked to the CF entry as it was failing the CF bot.

Thanks

Bertrand

Attachment Content-Type Size
poc_mark_all_txns_catalog_change.patch text/plain 5.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2022-02-21 10:42:07 Re: pgcrypto: Remove internal padding implementation
Previous Message wangsh.fnst@fujitsu.com 2022-02-21 10:18:11 show schema.collate in explain(verbose on)