RE: Conflict detection for update_deleted in logical replication

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Conflict detection for update_deleted in logical replication
Date: 2025-06-17 10:56:23
Message-ID: OS0PR01MB57168D21EFBBCF6B89F315CF9473A@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Mon, Jun 16, 2025 at 7:37 PM Amit Kapila wrote:
>
> On Thu, Jun 12, 2025 at 11:34 AM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
>
> Few comments on v36 patches:
> ==========================
> 1. In advance_conflict_slot_xmin(), we first save the slot to disk,
> then update its effective xmin, and then do the required xmin
> computation. Now, if we don't save the slot every time, there is a
> risk that its value can go backwards after a restart. But OTOH, for
> physical slots maintained by walsender for physical replication, we
> also don't save the physical slot. However, still the system works, see discussion in email: [1].
>
> As per my understanding, even if the conflict_slot's xmin moved back
> after restart, it shouldn't cause any problem. Because it will anyway
> be moved ahead in the next cycle, and there won't be any rows that
> will get removed but are required for conflict detection. If this is
> correct, then we don't need to save the slot in advance_conflict_slot_xmin().

I think you are right; it should be OK to avoid saving the slot each time.
I have changed the patch accordingly.
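
To illustrate the reasoning in point 1, here is a minimal sketch of why an
xmin that moves backwards after a restart is harmless. It is a toy model and
not the patch's code: ToySlot and every toy_* function are hypothetical names,
transaction ID wraparound is ignored, and "removable" is simplified to a plain
comparison against the slot's in-memory xmin.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model of a slot; not PostgreSQL's ReplicationSlot. */
typedef struct ToySlot
{
	uint32_t	xmin_in_memory;	/* value that holds back row removal */
	uint32_t	xmin_on_disk;	/* value that survives a restart */
} ToySlot;

/* Advance the in-memory xmin only; deliberately skip the disk write. */
void
toy_advance_xmin(ToySlot *slot, uint32_t new_xmin)
{
	if (new_xmin > slot->xmin_in_memory)
		slot->xmin_in_memory = new_xmin;
}

/* Persist the current value, whenever the slot happens to be saved. */
void
toy_save(ToySlot *slot)
{
	slot->xmin_on_disk = slot->xmin_in_memory;
}

/* Simulate a restart: the in-memory state is reloaded from disk. */
void
toy_restart(ToySlot *slot)
{
	slot->xmin_in_memory = slot->xmin_on_disk;
}

/* Simplified rule: a row deleted by deleter_xid is removable below xmin. */
int
toy_row_removable(const ToySlot *slot, uint32_t deleter_xid)
{
	return deleter_xid < slot->xmin_in_memory;
}
```

In this model a restart can only restore an older (smaller) xmin, so the slot
retains more rows than strictly needed until the next cycle advances it again;
it never permits removing a row that the pre-restart xmin was protecting.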

>
> 2.
> + *
> + * Issue a warning if track_commit_timestamp is not enabled when
> + * check_commit_ts is set to true.
> + *
> + * Issue a warning if the subscription is being disabled.
> + */
> +void
> +CheckSubConflictInfoRetention(bool retain_conflict_info, bool check_commit_ts,
> + bool disabling_sub)
> +{
> + if (!retain_conflict_info)
> + return;
> +
> + if (check_commit_ts && !track_commit_timestamp)
> + ereport(WARNING,
> + errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("commit timestamp and origin data required for detecting conflicts won't be retained"),
> + errhint("Consider setting \"%s\" to true.",
> + "track_commit_timestamp"));
> +
> + if (disabling_sub)
> + ereport(WARNING,
> + errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("deleted rows to detect conflicts would not be removed until the subscription is enabled"),
> + errhint("Consider setting %s to false.", "retain_conflict_info"));
>
> The quoted comments atop this function just say what it is apparent
> from the code. It is better if the comments explain why we allow to
> proceed when the above conditions are not met.

Added.

>
> I think we can probably add a check here that this option requires
> wal_level = replica as the launcher needs to create a physical slot to
> retain the required info.

Added.

After adding this ERROR, I removed the CREATE SUBSCRIPTION
(retain_conflict_info = true) test from subscription.sql, because it would
make the regression run in 002_pg_upgrade.pl fail: that TAP test sets
wal_level to minimal. I didn't add new tests, since subscription creation is
already covered by other TAP tests (004_subscription.pl, 035_conflicts.pl).
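
For readers following along, the shape of the wal_level check discussed above
can be sketched as below. This is a standalone illustration under stated
assumptions, not the code in the patch: ToyWalLevel and
toy_check_retain_conflict_info() are hypothetical names, the message text is
invented, and a returned string stands in for raising an ERROR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the server's wal_level setting, ordered low to high. */
typedef enum ToyWalLevel
{
	TOY_WAL_LEVEL_MINIMAL = 0,
	TOY_WAL_LEVEL_REPLICA,
	TOY_WAL_LEVEL_LOGICAL
} ToyWalLevel;

/*
 * Return NULL if the option combination is acceptable, otherwise an
 * illustrative message. A physical slot cannot be created below
 * wal_level = replica, so the option is rejected in that case.
 */
const char *
toy_check_retain_conflict_info(bool retain_conflict_info, ToyWalLevel wal_level)
{
	if (!retain_conflict_info)
		return NULL;			/* nothing to retain, nothing to check */

	if (wal_level < TOY_WAL_LEVEL_REPLICA)
		return "retain_conflict_info requires wal_level >= replica";

	return NULL;
}
```

With wal_level set to minimal, as in 002_pg_upgrade.pl, a subscription created
with retain_conflict_info = true takes the rejecting branch, which is why the
test was removed from subscription.sql.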

Note that 004_subscription.pl currently fails due to a crash related to
commit ca307d5. The buildfarm has hit the same crash, as reported in [1].
Please ignore this failure for now.

>
> 3. Isn't the new check for logical slots in
> check_new_cluster_subscription_configuration() somewhat redundant with
> the previous check done in check_new_cluster_logical_replication_slots()?
> Can't we combine both?

Merged as suggested.

>
> Apart from this, I have made a number of changes in the comments and a
> few other cosmetic changes in the attached.

Thanks, merged.

Here is the v38 patch set, which includes the following changes:

0001:
* Addressed the comments above.
* Addressed Shveta's comments[2].

0002:
* Addressed the comments above.
* Added documentation explaining that commit timestamp data is not
preserved during upgrade.

0003 to 0007:
* Rebased.

[1] https://www.postgresql.org/message-id/CALDaNm3s-jpQTe1MshsvQ8GO%3DTLj233JCdkQ7uZ6pwqRVpxAdw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAJpy0uDJ3ofFk4FFWCf6hLQZaPb%3Dry45906pqcQZ%2Bg-p3C%3D_JA%40mail.gmail.com

Best Regards,
Hou zj

Attachment Content-Type Size
v38-0002-Add-a-retain_conflict_info-option-to-subscriptio.patch application/octet-stream 100.6 KB
v38-0006-Support-the-conflict-detection-for-update_delete.patch application/octet-stream 25.5 KB
v38-0001-Retain-the-information-useful-for-detecting-conf.patch application/octet-stream 62.9 KB
v38-0003-Introduce-a-new-GUC-max_conflict_retention_durat.patch application/octet-stream 29.4 KB
v38-0004-Re-create-the-replication-slot-if-the-conflict-r.patch application/octet-stream 7.0 KB
v38-0005-Add-a-tap-test-to-verify-the-management-of-the-n.patch application/octet-stream 8.0 KB
v38-0007-Allow-altering-retain_conflict_info-for-enabled-.patch application/octet-stream 28.2 KB
