RE: Conflict detection for update_deleted in logical replication

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Conflict detection for update_deleted in logical replication
Date: 2025-05-22 02:57:57
Message-ID: OS0PR01MB5716B7983231B9DF3713EF2A9499A@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, May 20, 2025 at 7:41 PM Amit Kapila wrote:

>
> On Fri, May 16, 2025 at 5:0 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> >
>
> Please find some more comments on the 0001 patch:

Thanks for the comments!

> 1.
> We need to know about such transactions
> + * for conflict detection and resolution in logical replication. See
> + * GetOldestTransactionIdInCommit and its use.
>
> Do we need to mention resolution in the above sentence? This patch is
> all about detecting conflict reliably.

I think it’s not needed. Removed.

>
> 2. In wait_for_publisher_status(), we use remote_epoch,
> remote_nextxid, and remote_oldestxid to compute full transaction id's.
> Why can't we send FullTransactionIds for remote_oldestxid and
> remote_nextxid from publisher? If these are required, maybe a comment
> somewhere for that would be good.

It was intended to follow the existing "hot standby feedback" message, but
since in our case we do not need to use the epoch and xid separately, so I
changed it to FullTransactionId in this version.

>
> 3.
> /*
> + * Note it is important to set committs value after marking ourselves
> + as
> + * in the commit critical section (DELAY_CHKPT_IN_COMMIT). This is
> + because
> + * we want to ensure all such transactions are finished before we
> + allow
> + * the logical replication client to advance its xid which is used to
> + hold
> + * back dead rows for conflict detection. See
> + * maybe_advance_nonremovable_xid.
> + */
> + committs = GetCurrentTimestamp();
>
> How does setting committs after setting DELAY_CHKPT_IN_COMMIT help in
> advancing client-side xid? IIUC, on client-side, we simply wait for
> such an xid to be finished based on the remote_oldestxid and
> remote_nextxid sent via the server. So, the above comment is not
> completely clear to me. I have updated this and related comments in
> the attached diff patch to make it clear. See if that makes sense to you.

It looks good to me. Merged.

>
> 4.
> In 0001's commit message, we have: "Furthermore, the preserved commit
> timestamps and origin data are essential for consistently detecting
> update_origin_differs conflicts." But it is not clarified how this
> patch helps in consistently detecting update_origin_differs conflict?

The reason is the replication slot can also prevent the origin data from being
removed. The origin data could be removed by vacuum freeze, which could also be
blocked due to the replication slot.xmin. I have added it the commit message.

Attaching the V31 patch set which addressed comments in [1]~[8].

The comments in [9] concerning the new GUC in patch 0004 is still under review
and will be addressed in the next version.

[1]https://www.postgresql.org/message-id/CAJpy0uD6SgD7w839Wzezdj0JT2OnA%2BxCxddM15%3Dgb5nRqYAv%2BA%40mail.gmail.com
[2]https://www.postgresql.org/message-id/CAJpy0uCYqG16zCjiCK4og6yqR7zP2rB1oOT7%3DAnDdVePo-8RrA%40mail.gmail.com
[3]https://www.postgresql.org/message-id/CAA4eK1KemsW0EXaSy2Y-M-vVy5Gh4onNG%2B%2BkKs7ugY%2B3N-g-Yw%40mail.gmail.com
[4]https://www.postgresql.org/message-id/CAA4eK1%2Br9V6DpH9gYRa2xOx167FapbuKdc4gESr8Etxpx2zrqw%40mail.gmail.com
[5]https://www.postgresql.org/message-id/CAJpy0uArh0A9yOxoamD0RWM-7K9kyoUMNh5C2%2BPFTbGFoxf5wg%40mail.gmail.com
[6]https://www.postgresql.org/message-id/CAJpy0uDL4oLdhYup44a2%3D1OeyUSsKhg8Y30-h1uxcf%3Dmki57uA%40mail.gmail.com
[7]https://www.postgresql.org/message-id/CAA4eK1%2BVNaGi-GU6awgFKmTgidLTHo2HDuzV1%2BaT8sjn8QtPxg%40mail.gmail.com
[8]https://www.postgresql.org/message-id/CAA4eK1%2B%3DZAf0T2iMg2%2BZF4cJdUk%3DUViqpiOg_kPa8jgK%2Bg94aw%40mail.gmail.com
[9]https://www.postgresql.org/message-id/CAA4eK1LLaXzsKOaPwGTiikOYySYK%2BTy_x3EXg-g%3D7M_CLn4WiQ%40mail.gmail.com

Best Regards,
Hou zj

Attachment Content-Type Size
v31-0007-Support-the-conflict-detection-for-update_delete.patch application/octet-stream 23.7 KB
v31-0001-Maintain-the-oldest-non-removeable-tranasction-I.patch application/octet-stream 43.6 KB
v31-0002-Maintain-the-replication-slot-in-logical-launche.patch application/octet-stream 20.2 KB
v31-0003-Add-a-retain_conflict_info-option-to-subscriptio.patch application/octet-stream 87.7 KB
v31-0004-Introduce-a-new-GUC-max_conflict_retention_durat.patch application/octet-stream 29.7 KB
v31-0005-Re-create-the-replication-slot-if-the-conflict-r.patch application/octet-stream 10.7 KB
v31-0006-Add-a-tap-test-to-verify-the-management-of-the-n.patch application/octet-stream 6.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhijie Hou (Fujitsu) 2025-05-22 02:58:42 RE: Conflict detection for update_deleted in logical replication
Previous Message David Rowley 2025-05-22 02:36:38 Re: Add comment explaining why queryid is int64 in pg_stat_statements