[BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: "Oh, Mike" <minsoo(at)amazon(dot)com>
To: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>
Subject: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2021-03-15 16:34:56
Message-ID: 81D0D8B0-E7C4-4999-B616-1E5004DBDCD2@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sending this to pgsql-hackers list to create a CommitFest entry with the attached patch proposal.

Hello,
We noticed that the logical replication could fail when the Standby::RUNNING_XACT record is generated in the middle of a catalog modifying transaction and if the logical decoding has to restart from the RUNNING_XACT
WAL entry.
The Standby::RUNNING_XACT record is generated periodically (roughly every 15s by default) or during a CHECKPOINT operation.

Detailed problem description:
Tested on 11.8 & current master.
The logical replication slot restart_lsn advances in the middle of an open txn that modified the catalog (e.g. TRUNCATE operation).
Should the logical decoding has to restart it could fail with an error like this:
ERROR: could not map filenode "base/13237/442428"

Currently, the system relies on processing Heap2::NEW_CID to keep track of catalog modifying (sub)transactions.
This context is lost if the logical decoding has to restart from a Standby::RUNNING_XACTS that is written between the NEW_CID record and its parent txn commit.
If the logical stream restarts from this restart_lsn, then it doesn't have the xid responsible for modifying the catalog.

Repro steps:
1. We need to generate the Standby::RUNNING_XACT record deterministically using CHECKPOINT. Hence we'll delay the LOG_SNAPSHOT_INTERVAL_MS using the following patch:
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 3e6ffb05b9..b776e8d566 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -76,7 +76,7 @@ int BgWriterDelay = 200;

* Interval in which standby snapshots are logged into the WAL stream, in
* milliseconds.

/
-#define LOG_SNAPSHOT_INTERVAL_MS 15000
+#define LOG_SNAPSHOT_INTERVAL_MS 1500000
2. Create a table
postgres=# create table bdt (a int);
CREATE TABLE
3. Create a logical replication slot:
postgres=# select pg_create_logical_replication_slot('bdt_slot','test_decoding');
pg_create_logical_replication_slot
------------------------------------
(bdt_slot,0/FFAA1C70)
(1 row)
4. Start reading the slot in a shell (keep the shell so that we can stop reading later):
./bin/pg_recvlogical --slot bdt_slot --start -f bdt.out -d postgres
5. Execute the workload across 2 different clients in the following order
Session1:
begin;
savepoint b1;
truncate bdt;

Session2:
select * from pg_replication_slots; /* keep note of the confirmed_flush_lsn */
checkpoint;
/* Repeat the following query until the confirmed_flush_lsn changes */
select * from pg_replication_slots;

Once confirmed_flush_lsn, changes:
Session1:
end;
begin;
insert into bdt values (1);
Session2:
select * from pg_replication_slots; /* keep note of both restart_lsn AND the confirmed_flush_lsn */
checkpoint;
/* Repeat the following query until both restart_lsn AND confirmed_flush_lsn change */
select * from pg_replication_slots;
6. Stop the pg_recvlogical (Control-C)
7. Then commit the insert txn:
Session1:
end;
8. Get/peek the replication slot changes
postgres=# select * from pg_logical_slot_get_changes('bdt_slot', null, null);
ERROR: could not map filenode "base/13237/442428" to relation OID

Proposed solution:
If we’re decoding a catalog modifying commit record, then check whether it’s part of the RUNNING_XACT xid’s processed @ the restart_lsn. If so, then add its xid & subxacts in the committed txns list in the logical decoding snapshot.

Please refer to the attachment for the proposed patch.

Thanks,
Mike

Attachment Content-Type Size
logical_decoding_xact_bookkeep.patch application/octet-stream 5.8 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2021-03-15 16:43:36 Re: Parser Hook
Previous Message Bossart, Nathan 2021-03-15 16:34:29 Re: archive status ".ready" files may be created too early