Re: Conflict detection for update_deleted in logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2025-05-15 12:32:40
Message-ID: CAA4eK1KemsW0EXaSy2Y-M-vVy5Gh4onNG++kKs7ugY+3N-g-Yw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 25, 2025 at 4:06 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> >
> > Here is V30 patch set includes the following changes:
> >
>
> Thank You for the patch, please find few concerns:
>
> 1)
> By looking at code of ApplyLauncherMain(), it appears that we stopped
> advancing shared-slot's xmin if any of the subscriptions with
> retain_conflict_info is disabled. If a subscription is not being used
> and is disabled forever (or for quite long), that means xmin will
> never be advanced and we will keep accumulating dead-rows even if
> other subscribers with retain_conflict_info are actively setting their
> oldest_xmin. This could be problematic. Here too, there should be some
> way to set stop-conflict-rettention for such a subscription like we do
> for 'subscriber not able to catch-up case'.
> But I understand it can be complex to implement as we do not know for
> how long a subscription is disabled. If we do not find a simpler way
> to implement it, then at least we can document such cases and the
> risks associated with disabled subscription which has
> 'retain_conflict_info' enabled. Thoughts?
>

Yeah, I agree that this is the case to be worried about but OTOH, in
such cases, even the corresponding logical_slot on the primary won't
be advanced and lead to bloat on the primary as well. So, we can
probably give WARNING when user tries to disable a subscription or
create a disabled subscription with retention flag true. We may need
to write a LOG or raise a WARNING when while exiting apply worker
disables the subscription due disable_on_error option. In addition to
that, we can even document this case. Will that address your concern?

Few minor comments on 0001:
1.
+ if (TimestampDifferenceExceeds(data->reply_time,
+ data->candidate_xid_time, 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("oldest_nonremovable_xid transaction ID may be advanced prematurely"),

Shouldn't this be elog as this is an internal message? And, instead of
"... may be ..", shall we use ".. could be .." in the error message as
the oldest_nonremovable_xid is not yet advanced by this time.

2.
+ * It's necessary to use FullTransactionId here to mitigate potential race
+ * conditions. Such scenarios might occur if the replication slot is not
+ * yet created by the launcher while the apply worker has already
+ * initialized this field.

IIRC, we discussed why it isn't easy to close this race condition. Can
we capture that in comments separately?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2025-05-15 12:48:58 Re: Restrict publishing of partitioned table with a foreign table as partition
Previous Message Fujii Masao 2025-05-15 11:30:14 Re: [BUG] Skipped initialization of some xl_xact_parsed_prepare fields