RE: Conflict detection for update_deleted in logical replication

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: RE: Conflict detection for update_deleted in logical replication
Date: 2025-07-25 11:08:42
Message-ID: OS0PR01MB5716B80417A554E1030276909459A@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Friday, July 25, 2025 2:33 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Jul 23, 2025 at 12:53 PM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > Thanks for pushing. I have rebased the remaining patches.
> >
>
> + * This function performs a full table scan instead of using indexes
> + because
> + * index scans could miss deleted tuples if an index has been
> + re-indexed or
> + * re-created during change applications.
>
> IIUC, once the tuple is not found during update, the patch does an additional
> scan with SnapshotAny to find the DEAD tuple, so that it can report
> update_deleted conflict, if one is found. The reason in the comments to do
> sequential scan in such cases sound reasonable but I was thinking if we can
> do index scans if the pg_conflict_* slot's xmin is ahead of the RI (or any usable
> index that can be used during scan) index_tuple's xmin? Note, we use a similar
> check with the indcheckxmin parameter in pg_index though the purpose of
> that is different. If this can happen then still in most cases the index scan will
> happen.

Right, I think it makes sense to do with the index scan when the index's xmin is
less than the conflict detection xmin, as that can ensure that all the tuples
deleted before the index creation or re-indexing are irrelevant for conflict
detection.

I have implemented in the V53 patch set and improved the test to verify both
index and seq scan for dead tuples.

The V53-0001 also includes Shveta's comments in [1].

Apart from above issue,
I'd like to clarify why we scan all matching dead tuples in the relation to
find the most recently deleted one in the patch, and I will share an example for
the same.

The main reason is that only the latest deletion information is relevant for
resolving conflicts. If the first tuple retrieved is antiquated while a newer
deleted tuple exists, users may incorrectly resolve the remote change by
applying a last-update-win strategy. Here is an example:

1. In a BI-cluster setup, if both nodes initially contain empty data:

Node A: tbl (empty)
Node B: tbl (empty)

2. Then if user do the following operations on Node A and wait for them to be
replicated to Node B:

INSERT (pk, 1)
DELETE (pk, 1) @9:00
INSERT (pk, 1)

The data on both nodes looks like:

Node A: tbl (pk, 1) - live tuple
(pk, 1) - dead tuple - @9:00
Node B: tbl (pk, 1) - live tuple
(pk, 1) - dead tuple - @9:00

3. If a user do DELETE (pk) on Node B @9:02, and do UDPATE (pk, 1)->(pk, 2) on Node A
@9:01.

When applying the UPDATE on Node B, it cannot find the target tuple, so will
search the dead tuples, but there are two dead tuples:

Node B: tbl (pk, 1) - live tuple
(pk, 1) - dead tuple - @9:00
(pk, 1) - dead tuple - @9:02

If we only fetch the first tuple in the scan, it could be either a) the tuple
deleted @9:00 which is older than the remote UPDATE, or b) the tuple deleted
@9:02, which is newer than the remote UPDATE is @9:01. User may choose to apply
the UPDATE for case a) which can cause data inconsistency between nodes
(using last-update-win strategy).

Ideally, we should give the resolve the new dead tuple @9:02, so the resolver
can choose to ignore the remote UDPATE, keeping the data consistent.

[1] https://www.postgresql.org/message-id/CAJpy0uDiyjDzLU-%3DNGO7PnXB4OLy4%2BRxJiAySdw%3Da%2BYO62JO2g%40mail.gmail.com

Best Regards,
Hou zj

Attachment Content-Type Size
v53-0003-Re-create-the-replication-slot-if-the-conflict-r.patch application/octet-stream 8.5 KB
v53-0001-Support-the-conflict-detection-for-update_delete.patch application/octet-stream 39.4 KB
v53-0002-Introduce-a-new-GUC-max_conflict_retention_durat.patch application/octet-stream 31.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhijie Hou (Fujitsu) 2025-07-25 11:08:46 RE: Conflict detection for update_deleted in logical replication
Previous Message Ashutosh Bapat 2025-07-25 10:59:00 Re: Document transition table triggers are not allowed on views/foreign tables