| From: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
|---|---|
| To: | Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shvetamalik(at)gmail(dot)com> |
| Subject: | Re: Improve conflict detection when replication origins are reused |
| Date: | 2026-05-15 09:56:58 |
| Message-ID: | CAJpy0uAYRCbVN_QGYrF_u++wdC4oej_w1S3A_BuWtB3Gct7mhw@mail.gmail.com |
| Lists: | pgsql-hackers |
On Fri, May 15, 2026 at 8:56 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Thu, May 14, 2026 at 8:35 AM Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
> >
> > Hi hackers,
> >
> > While reviewing the issue reported at [1] and the proposed solutions
> > at [2], I noticed a related problem: false negative conflict detection
> > when a 'ReplOriginId' gets reused.
> >
> > In logical replication, conflict detection relies on the tuple’s
> > replication origin ('roident'). The problem is that if a subscription
> > is dropped and a new subscription later reuses the same origin ID, the
> > apply worker may incorrectly treat incoming changes as “its own”
> > changes and skip conflict detection.
> >
> > A simple example:
> > 1. Create subscription sub1 with 'roident = 1'
> > 2. Replicate some rows into table 't1'
> > 3. Drop 'sub1'
> > 4. Create another subscription 'sub2'
> > 5. 'sub2' reuses 'roident = 1'
> > 6. New updates arrive for rows previously written by 'sub1'
> > At this point, conflict detection sees:
> > tuple_origin == current_origin
> >
> > and incorrectly assumes the row was written by the current
> > subscription instance, so no 'update_origin_differ' conflict is
> > raised.
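The sequence above can be sketched in SQL as follows; the connection strings, publication, and table names are illustrative placeholders, not part of the reported setup:

```sql
-- On the subscriber; names and connection strings are illustrative only.
CREATE SUBSCRIPTION sub1
    CONNECTION 'host=pub_host1 dbname=postgres'
    PUBLICATION pub;        -- its replication origin gets roident = 1

-- ... rows replicate into t1; their commit_ts records carry origin 1 ...

DROP SUBSCRIPTION sub1;     -- roident 1 becomes free for reuse

CREATE SUBSCRIPTION sub2
    CONNECTION 'host=pub_host2 dbname=postgres'
    PUBLICATION pub;        -- may silently reuse roident = 1

-- An incoming UPDATE for a row originally written by sub1 now sees
-- tuple_origin == current_origin, so update_origin_differ is skipped.
```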
>
> I agree with the problem statement. I will prioritize the review soon.
>
> > This may look harmless in this simple setup, but it becomes
> > problematic if the new subscription is connected to a different
> > publisher, because real conflicts can then be silently missed.
> >
> > I explored two possible approaches to solve this:
> >
> > Approach 1. Zero out old origin IDs in commit_ts data when dropping a
> > subscription
> > ----------------------
> > - When a subscription is dropped and its replication origin becomes
> > free, scan all 'commit_ts' SLRU entries and replace that old origin ID
> > with 'InvalidRepOriginId (0)'.
> > - So rows previously written by the old subscription would no longer
> > appear to belong to any active replication origin.
> > - A new subscription reusing the same 'roident' will always conflict
> > with origin '0'.
> >
> > Pros:
> > - Fixes the stale-origin problem completely and may also help solve
> > the tablesync-origin issue discussed in [1]
> > - No additional checks needed during conflict detection
> >
> > Cons:
> > - Requires scanning the entire 'commit_ts' SLRU during DROP
> > SUBSCRIPTION, so it can become very expensive on large systems
> > - Not crash-safe in the current patch:
> > - if the server crashes midway, some entries may still contain the
> > old origin ID
> > - after restart, reused origins can again lead to missed conflicts
> > - Making this fully crash-safe would likely require WAL logging or
> > recovery-time reprocessing.
> >
> > Approach 2. Store replication origin creation time
> > ----------------------
> > - Add a creation timestamp for each replication origin
> > - During conflict check:
> >     if tuple_origin != current_origin
> >         -> existing behavior
> >     if tuple_origin == current_origin
> >         -> compare tuple commit timestamp with origin creation time
> >         if tuple_commit_ts <= origin_creation_time
> >             -> treat as an origin reuse case and raise conflict
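As a rough illustration of the proposed check, the comparison could be expressed in SQL using the existing pg_xact_commit_timestamp_origin() function; note that 'rocreated' is a hypothetical column this approach would add to pg_replication_origin, and the xid is a placeholder:

```sql
-- Hypothetical sketch: 'rocreated' does not exist today; it is the
-- origin creation timestamp that Approach 2 would add to the catalog.
-- Requires track_commit_timestamp = on.
SELECT ct.roident   = ro.roident   AS same_origin,
       ct.timestamp <= ro.rocreated AS possible_origin_reuse
FROM   pg_xact_commit_timestamp_origin('12345'::xid) AS ct,
       pg_replication_origin AS ro
WHERE  ro.roident = ct.roident;
```

In the apply worker the same comparison would happen in C against the in-memory origin state rather than via catalog lookups, but the decision logic is the same.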
> >
> > Pros:
> > - No additional processing during DROP SUBSCRIPTION
> > - Lightweight runtime check (just one timestamp comparison)
> > - Naturally crash-safe since origin creation is WAL-logged already
> >
> > Cons:
> > - Requires a catalog schema change
> > - The <= comparison can produce false-positive conflicts for rows
> > committed at the exact same microsecond as origin creation
> > - May require additional handling for upgraded origins
> >
> > IMO, the second approach currently looks more practical because it
> > avoids the heavy SLRU scan and crash-recovery complexity.
> >
> > Attached:
> > - Patch for approach 1
> > - Patch for approach 2
> > - A TAP test reproducing the issue
> >
> > Note: The patches are manually tested for the reported issue, but not
> > yet tested for performance or additional edge cases.
> >
> > Feedback and suggestions are welcome.
> >
> > [1] https://www.postgresql.org/message-id/CALDaNm3Y6Y4Mub6QC8fZKnNy5jZspELQYCoQF_FL2Zwzweu%3Dog%40mail.gmail.com
> > [2] https://www.postgresql.org/message-id/CAA4eK1LxGXR7jOAKh0B8N362S-Q3b6GhBxxcV_HxUaicEPq5Cg%40mail.gmail.com
> >
> > --
Nisha, I think we will get the same problem in another scenario too:
create pub1-server1
create pub1-server2
create sub1-server3; subscribing to pub1-server1
--On both server1 and server2, insert the same set of rows:
insert into tab1 values (10), (20), (30);
Sub1 (server3) will get the rows from server1.
Now alter sub1 to connect to server2 (you will have to create the slot
manually on server2):
SELECT pg_create_logical_replication_slot('sub1', 'pgoutput', false,
false, false);
--Now perform the update on server2:
update tab1 set i=11 where i=10;
The subscriber on server3 will receive the update from server2 and will
update the row originally inserted by server1, without raising
update_origin_differ.
Can you please confirm if my understanding of the problem statement is
correct and if the scenario above will also result in a similar
situation? IIUC, in such a case, the proposed solutions may not work
directly and will need to be further evolved. I will think more once
you confirm my understanding.
thanks
Shveta