| From: | cca5507 <cca5507(at)qq(dot)com> |
|---|---|
| To: | pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | [BUG] Incorrect historic snapshot may be serialized to disk during fast-forwarding |
| Date: | 2025-11-22 08:55:05 |
| Message-ID: | tencent_3A071B760AA1A38540B57F297332B7781C08@qq.com |
| Lists: | pgsql-hackers |
Hi,
While working on another historic snapshot bug in [1], I found the problem described in $subject.
Here is a test case. It requires adding some logging to SnapBuildSerialize() first:
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 6e18baa33cb..6d13b2d811b 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1523,6 +1523,19 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
     /* consistent snapshots have no next phase */
     Assert(builder->next_phase_at == InvalidTransactionId);
 
+    StringInfoData logbuf;
+    initStringInfo(&logbuf);
+    appendStringInfo(&logbuf, "SnapBuildSerialize: lsn: %X/%08X xmin: %u, xmax: %u, committed: ",
+                     LSN_FORMAT_ARGS(lsn), builder->xmin, builder->xmax);
+    for (size_t i = 0; i < builder->committed.xcnt; i++)
+    {
+        if (i > 0)
+            appendStringInfoString(&logbuf, ", ");
+        appendStringInfo(&logbuf, "%u", builder->committed.xip[i]);
+    }
+    elog(LOG, "%s", logbuf.data);
+    pfree(logbuf.data);
+
     /*
      * We identify snapshots by the LSN they are valid for. We don't need to
      * include timelines in the name as each LSN maps to exactly one timeline
1) create table t (id int) with (user_catalog_table = true);
2) select pg_create_logical_replication_slot('s1', 'test_decoding');
3) select pg_create_logical_replication_slot('s2', 'test_decoding');
4) insert into t values (1);
5) select pg_replication_slot_advance('s1', pg_current_wal_lsn());
6) select pg_logical_slot_get_changes('s2', pg_current_wal_lsn(), null);
The server log then contains entries like these:
LOG: SnapBuildSerialize: lsn: 0/017D1318 xmin: 768, xmax: 768, committed:
STATEMENT: select pg_replication_slot_advance('s1', pg_current_wal_lsn());
LOG: SnapBuildSerialize: lsn: 0/017D1318 xmin: 768, xmax: 769, committed: 768
STATEMENT: select pg_logical_slot_get_changes('s2', pg_current_wal_lsn(), null);
At the same LSN we get two different historic snapshots, and the first one (which is incorrect, since its committed array is empty) is the one serialized to disk.
The main reason is that we don't handle XLOG_HEAP2_NEW_CID during fast-forwarding, so the insert into the user catalog table is never treated as a catalog change.
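To illustrate the mechanism, here is a minimal toy model in C. It is only a sketch and not PostgreSQL code: every name in it (ToyRecord, ToySnapshot, toy_replay, skip_new_cid) is hypothetical, and it assumes, as a simplification, that a committing transaction is added to the committed array and pushes xmax forward only if a NEW_CID record was seen for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins only; none of these are PostgreSQL types or functions. */
typedef uint32_t ToyXid;

typedef enum { REC_NEW_CID, REC_INSERT, REC_COMMIT } ToyRecType;

typedef struct
{
    ToyRecType type;
    ToyXid     xid;
} ToyRecord;

#define TOY_MAX_XIDS 8

typedef struct
{
    ToyXid xmin;
    ToyXid xmax;
    ToyXid committed[TOY_MAX_XIDS];
    size_t ncommitted;
} ToySnapshot;

/*
 * Replay a list of records and return the resulting toy snapshot.
 * With skip_new_cid = true, NEW_CID records are ignored, which models
 * the fast-forward behavior described in this mail.
 */
ToySnapshot
toy_replay(const ToyRecord *recs, size_t nrecs, ToyXid start_xid,
           bool skip_new_cid)
{
    ToySnapshot snap = {.xmin = start_xid, .xmax = start_xid, .ncommitted = 0};
    ToyXid      catalog_xids[TOY_MAX_XIDS];
    size_t      ncatalog = 0;

    for (size_t i = 0; i < nrecs; i++)
    {
        if (recs[i].type == REC_NEW_CID && !skip_new_cid)
        {
            /* Remember that this transaction changed a catalog. */
            assert(ncatalog < TOY_MAX_XIDS);
            catalog_xids[ncatalog++] = recs[i].xid;
        }
        else if (recs[i].type == REC_COMMIT)
        {
            bool has_catalog_change = false;

            for (size_t j = 0; j < ncatalog; j++)
                if (catalog_xids[j] == recs[i].xid)
                    has_catalog_change = true;

            /* Only catalog-changing transactions enter the snapshot. */
            if (has_catalog_change)
            {
                assert(snap.ncommitted < TOY_MAX_XIDS);
                snap.committed[snap.ncommitted++] = recs[i].xid;
                if (recs[i].xid >= snap.xmax)
                    snap.xmax = recs[i].xid + 1;
            }
        }
        /* REC_INSERT does not affect the snapshot in this model. */
    }
    return snap;
}
```

Replaying the records NEW_CID(768), INSERT(768), COMMIT(768) from start_xid 768 with skip_new_cid = true leaves xmax at 768 with nothing committed, while skip_new_cid = false gives xmax 769 and committed = {768}, which matches the shape of the two log lines above.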
A patch to fix this is attached.
Looking forward to your reply.
[1]
https://www.postgresql.org/message-id/tencent_21E152AD504A814C071EDF41A4DD7BA84D06%40qq.com
--
Regards,
ChangAo Chen
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Handle-XLOG_HEAP2_NEW_CID-in-heap2_decode-even-if.patch | application/octet-stream | 963 bytes |