From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | RE: Conflict detection for update_deleted in logical replication |
Date: | 2025-06-06 09:47:49 |
Message-ID: | OS3PR01MB57180E5094DAE09037295EBA946EA@OS3PR01MB5718.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jun 6, 2025 at 1:49 PM shveta malik wrote:
>
> On Wed, Jun 4, 2025 at 4:12 PM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > Here is the V33 patch set which includes the following changes:
> >
>
> please find few comments for patch003:
Thanks for the comments!
>
> 1)
> + /*
> + * Skip the track_commit_timestamp check by passing it as
> + * true, since it has already been validated during CREATE
> + * SUBSCRIPTION and ALTER SUBSCRIPTION SET commands.
> + */
> + CheckSubConflictInfoRetention(sub->retainconflictinfo,
> + true, opts.enabled);
> +
>
> Is there a special reason for disabling WARNING while enabling the
> subscription? If rci subscription was created in disabled state and
> track_commit_timestamp was enabled at that time, then there will be no
> WARNING. But while enabling the sub at a later stage, it may be
> possible that track_commit_timestamp is off but rci as ON.
I feel reporting a WARNING related to track_commit_timestamp during
subscription enable DDL is a bit unnatural, since it's not directly related to the
this DDL. Also, I think we do not intend to capture scenarios where
track_commit_timestamp is disabled afterwards.
>
> 2)
>
> * The worker has not yet started, so there is no valid
> * non-removable transaction ID available for advancement.
> */
> + if (sub->retainconflictinfo)
> + can_advance_xmin = false;
>
> Shall we change comment to:
> Only advance xmin when all workers for rci enabled subscriptions are
> up and running.
Adjusted according to your suggestion.
>
>
> 3)
>
> Assert(MyReplicationSlot);
> - Assert(TransactionIdIsValid(new_xmin));
> Assert(TransactionIdPrecedesOrEquals(MyReplicationSlot->data.xmin,
> new_xmin));
>
> + if (!TransactionIdIsValid(new_xmin))
> + return;
>
>
> a)
> Why have we replaced Assert with 'if' check? In which scenario do we
> expect new_xmin as Invalid here?
I think it's not needed now, so removed.
> 4)
> DisableSubscriptionAndExit:
> + /*
> + * Skip the track_commit_timestamp check by passing it as true, since it
> + * has already been validated during CREATE SUBSCRIPTION and ALTER
> + * SUBSCRIPTION SET commands.
> + */
> + CheckSubConflictInfoRetention(MySubscription->retainconflictinfo, true,
> + false);
>
> This comment makes sense during alter-sub enable, here shall we change it
> to:
> Skip the track_commit_timestamp check by passing it as true as it is
> not needed to be checked during subscription-disable.
Changed.
>
>
> 5)
> postgres=# alter subscription sub3 set (retain_conflict_info=false);
> ERROR: cannot set option retain_conflict_info for enabled subscription
>
> Do we need this restriction during disable of rci as well?
I prefer to maintain the restriction on both enabling and disabling operations
for the sake of simplicity, since the primary aim of this restriction is to
keep the logic straightforward and eliminate the need to think and address all
potential race conditions. I think restricting only the enable operation is
also OK and would not introducing new issues, but it might be more prudent to
keep things simple in the first version. Once the main patches stabilize, we
can consider easing or removing the entire restriction.
>
> 6)
> + <para>
> + If the <link
> linkend="sql-createsubscription-params-with-retain-conflict-info"><literal
> >retain_conflict_info</literal></link>
> + option is altered to <literal>false</literal> and no other subscription
> + has this option enabled, the additional replication slot that was created
> + to retain conflict information will be dropped.
> + </para>
>
> It will be good to mention the slot name as well.
Added.
>
>
> 7)
> + * Verify that the max_active_replication_origins and max_replication_slots
> + * configurations specified are enough for creating the subscriptions. This is
> + * required to create the replication origin and the conflict detection slot
> + * for each subscription.
> */
>
> We shall rephrase the comment, it gives the feeling that a 'conflict
> detection slot' is needed for each subscription.
Right, changed.
Here is the V34 patch set which includes the following changes:
0001:
* pgindent
0002:
* pgindent
0003:
* pgindent
* Addressed above comments from Shvete
* Improved the comments atop of the new restrcition.
* Ensured that the worker restarts when the retain_conflict_info was enabled
during startup regardless of the existence of the slot.
In V33, we relied on the existence of slot to decide whether the worker needs
to restart on startup option change. But we found that even if the slot
exists when launching the apply worker with(retain_conflict_info=false), the
slot could be removed soon by the launcher since the launcher might find
there is no subscription that enables retain_conflict_info. So the worker
could start to maintain the oldest_nonremovable_xid when the slot is not yet
created.
0004:
* pgindent
* Fixed some inaccurate and wrong descriptions in the document.
0005:
* pgindent
0006:
* pgindent
0007:
* pgindent
0008:
* A new patch to remove the restriction on altering retain_conflict_info when
the subscription is enabled, and resolves race condition issues caused by the
new option value being asynchronously acknowledged by the launcher and apply
workers. It changed the oldest_nonremovable_xid to FullTransactionID so that
even if the warparound happens, we can correctly identity if the transaction ID
a old or new one. Additioanly, it adds a safeguard check when advancing
slot.xmin to prevent backward movement.
The 0008 is kept as .txt to prevent the BF failure from testing it at this stage.
Best Regards,
Hou zj
Attachment | Content-Type | Size |
---|---|---|
v34-0002-Maintain-the-replication-slot-in-logical-launche.patch | application/octet-stream | 19.6 KB |
v34-0003-Add-a-retain_conflict_info-option-to-subscriptio.patch | application/octet-stream | 95.4 KB |
v34-0004-Introduce-a-new-GUC-max_conflict_retention_durat.patch | application/octet-stream | 27.8 KB |
v34-0005-Re-create-the-replication-slot-if-the-conflict-r.patch | application/octet-stream | 9.0 KB |
v34-0006-Add-a-tap-test-to-verify-the-management-of-the-n.patch | application/octet-stream | 7.2 KB |
v34-0007-Support-the-conflict-detection-for-update_delete.patch | application/octet-stream | 25.6 KB |
v34-0008-Allow-altering-retain_conflict_info-for-enabled-.patch.txt | text/plain | 26.7 KB |
v34-0001-Maintain-the-oldest-non-removable-transaction-ID.patch | application/octet-stream | 43.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | shveta malik | 2025-06-06 10:02:45 | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Previous Message | Andy Fan | 2025-06-06 09:32:02 | Re: Some questions about gin index |