| From: | jihyun bahn <rring0727(at)gmail(dot)com> |
|---|---|
| To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Cc: | Mikhail Kharitonov <mikhail(dot)kharitonov(dot)dev(at)gmail(dot)com> |
| Subject: | Re: [PATCH] Fix replica identity mismatch for partitioned tables with publish_via_partition_root |
| Date: | 2026-06-17 05:15:55 |
| Message-ID: | 178167335561.992.15549275027332613587.pgcf@coridan.postgresql.org |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested
Hi Mikhail, and Masahiko and Alexander,
I tested v3 of this patch on current master and looked into the
subscriber-side divergence Masahiko raised in January, since it
seemed to be the open question holding the thread up. Sharing what I
found, in case the data is useful.
Environment:
- PostgreSQL master (commit 68ace967c16)
- macOS 14 (Darwin 24.2) on Apple Silicon, Apple clang 16
- meson build with -Dcassert=true
v3-0001 applies cleanly and builds without new warnings.
What v3 changes, on the wire
----------------------------
Using Masahiko's setup -- a list-partitioned root p with REPLICA
IDENTITY FULL, leaf c1 FULL and leaf c2 USING INDEX, published with
publish_via_partition_root = true -- I dumped the DELETE of the c2 row
with pg_logical_slot_peek_binary_changes (pgoutput, proto_version 4):
before v3: D ... O t "2" n -- tag 'O', column b sent as NULL
after v3: D ... K t "2" n -- tag 'K', column b sent as NULL
So v3 correctly changes the tag from 'O' to 'K' for the leaf whose
replica identity is index-based, which is what it documents. The
old-tuple payload itself is unchanged (b is still NULL).
What does not change, on the subscriber
---------------------------------------
The apply worker reads the 'O'/'K' tag only to decide whether an old
tuple follows; it does not branch on the tag afterwards. So native
PostgreSQL-to-PostgreSQL apply behaves identically with and without
v3. I confirmed this by varying only whether the subscriber table
has a key index:
subscriber p(a, b), no PK/index:
DELETE FROM p WHERE a = 2; -- not applied; (2,20) remains
LOG: conflict detected on relation "public.p":
conflict=delete_missing
DETAIL: Could not find the row to be deleted:
replica identity full (2, null).
subscriber p(a int primary key, b):
DELETE FROM p WHERE a = 2; -- applied correctly, row removed
This holds both before and after v3. So the divergence Masahiko saw
reproduces only when the subscriber lacks a replica-identity/PK index
on the key column; with such an index the key-only old tuple matches
via the index path and applies fine. The divergence is governed by
the subscriber's index, not by the 'O'/'K' tag, which is why v3
(correctly) does not change it.
That suggests v3's effect is precisely scoped to protocol conformance
for consumers that trust the tag (external CDC), which is what it
claims. It might be worth saying so explicitly in the commit message,
to keep it separate from the native-divergence question, which it does
not address.
On the deeper issue
-------------------
The native divergence comes from the old tuple carrying a NULL for a
column that the subscriber -- told the relation is REPLICA IDENTITY
FULL, from the root -- then compares as a real value. I tried a small
experiment: a new per-column wire marker meaning "this column is not
part of the replica identity and was not sent", emitted under a new
protocol version for the columns outside the leaf's replica identity.
The subscriber excludes those columns from old-tuple matching, reusing
the same per-column exclusion that RelationFindDeletedTupleInfoSeq()
already does for conflict detection. With that, the no-index
subscriber above matches the c2 DELETE by key and the divergence
disappears, while older subscribers fall back to today's behavior. It
does not touch the Relation message.
I am not proposing that here -- it is a rough prototype and larger
than this patch's scope. But I wanted to ask: is a direction
like that of interest as a separate patch, or is a root/leaf
replica-identity mismatch under publish_via_partition_root considered a
misconfiguration that should instead be warned about at publication
time? I could not find the heterogeneous-RI case settled either way in
the archives -- a 2019 thread from Alvaro, "propagating replica
identity to partitions", explored making it propagate, but as far as I
can tell that did not end in a committed change (replica identity still
does not cascade to partitions today). So I am unsure which way the
project leans.
Separately, on Alexander's point that the test only creates a
subscriber and waits for sync: I would be happy to help extend the TAP
test to assert the subscriber's final table state, if that is useful.
Regards,
Jihyun Bahn
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Michael Paquier | 2026-06-17 05:42:21 | Re: faulty error handling around pgstat_count_io_op_time() |
| Previous Message | Michael Paquier | 2026-06-17 05:14:48 | Re: Fix DROP PROPERTY GRAPH "unsupported object class" error |