Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Oh, Mike" <minsoo(at)amazon(dot)com>
Subject: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2021-09-23 08:44:16
Message-ID: 6a176936-1af1-41d9-9ded-bde0eb47dd5c@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 7/29/21 10:25 AM, Masahiko Sawada wrote:
> Thank you for reporting the issue.
>
> I could reproduce this issue by the steps you shared.

Thanks for looking at it!

>
>> Currently, the system relies on processing Heap2::NEW_CID to keep track of catalog modifying (sub)transactions.
>>
>> This context is lost if the logical decoding has to restart from a Standby::RUNNING_XACTS that is written between the NEW_CID record and its parent txn commit.
>>
>> If the logical stream restarts from this restart_lsn, then it doesn't have the xid responsible for modifying the catalog.
>>
> I agree with your analysis. Since we don’t use commit WAL record to
> track the transaction that has modified system catalogs, if we decode
> only the commit record of such transaction, we cannot know the
> transaction has been modified system catalogs, resulting in the
> subsequent transaction scans system catalog with the wrong snapshot.
Right.
>
> With the patch, if the commit WAL record has a XACT_XINFO_HAS_INVALS
> flag and its xid is included in RUNNING_XACT record written at
> restart_lsn, we forcibly add the top XID and its sub XIDs as a
> committed transaction that has modified system catalogs to the
> snapshot. I might be missing something about your patch but I have
> some comments on this approach:
>
> 1. Commit WAL record may not have invalidation message for system
> catalogs (e.g., when commit record has only invalidation message for
> relcache) even if it has XACT_XINFO_HAS_INVALS flag.

Right, good point (create policy for example would lead to an
invalidation for relcache only).

> In this case, the
> transaction wrongly is added to the snapshot, is that okay?
This transaction is a committed one, and IIUC the snapshot would be used
only for catalog visibility, so i don't see any issue to add it in the
snapshot, what do you think?
>
> 2. We might add a subtransaction XID as a committed transaction that
> has modified system catalogs even if it actually didn't.

Right, like when needs_timetravel is true.

> As the
> comment in SnapBuildBuildSnapshot() describes, we track only the
> transactions that have modified the system catalog and store in the
> snapshot (in the ‘xip' array). The patch could break that assumption.
Right. It looks to me that breaking this assumption is not an issue.

IIUC currently the committed ones that are not modifying the catalog are
not stored "just" because we don't need them.
> However, I’m really not sure how to deal with this point. We cannot
> know which subtransaction has actually modified system catalogs by
> using only the commit WAL record.
Right, unless we rewrite this patch so that a commit WAL record will
produce this information.
>
> 3. The patch covers only the case where the restart_lsn exactly
> matches the LSN of RUNNING_XACT.
Right.
> I wonder if there could be a case
> where the decoding starts at a WAL record other than RUNNING_XACT but
> the next WAL record is RUNNING_XACT.

Not sure, but could a restart_lsn not be a RUNNING_XACTS?

Thanks

Bertrand

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Juan José Santamaría Flecha 2021-09-23 09:05:47 Re: stat() vs ERROR_DELETE_PENDING, round N + 1
Previous Message Aleksander Alekseev 2021-09-23 08:13:43 Re: mark the timestamptz variant of date_bin() as stable