From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
Subject: | Re: Conflict detection for update_deleted in logical replication |
Date: | 2025-07-18 21:10:50 |
Message-ID: | CAD21AoA4VBePo0CCV8t8TL6cFxx2dEVuo3UhjW8+t3KU2W1xqA@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Jul 7, 2025 at 3:31 AM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Mon, Jul 7, 2025 at 10:13 AM Zhijie Hou (Fujitsu) wrote:
> >
> > On Sun, Jul 6, 2025 at 10:51 PM Masahiko Sawada wrote:
> > >
> > > On Fri, Jul 4, 2025 at 8:18 PM Zhijie Hou (Fujitsu)
> > > <houzj(dot)fnst(at)fujitsu(dot)com>
> > > wrote:
> > > >
> > > > On Wed, Jul 2, 2025 at 3:28 PM Hou, Zhijie wrote:
> > > > > Kindly use the latest patch set for performance testing.
> > > >
> > > > During testing, we observed a limitation in cascading logical
> > > > replication setups, such as (A -> B -> C). When retain_conflict_info
> > > > is enabled on Node C, it may not retain information necessary for
> > > > conflict detection when applying changes originally replicated from
> > > > Node A. This happens because Node C only waits for locally
> > > > originated changes on Node B to be applied before advancing the
> > > > non-removable
> > > transaction ID.
> > > >
> > > > For example, Consider a logical replication setup as mentioned above
> > > > : A -> B
> > > -> C.
> > > > - All three nodes have a table t1 with two tuples (1,1) (2,2).
> > > > - Node B subscribed to all changes of t1 from Node A
> > > > - Node-C subscribed to all changes from Node B.
> > > > - Subscriptions use the default origin=ANY, as this is not a bidirectional
> > > > setup.
> > > >
> > > > Now, consider two concurrent operations:
> > > > - @9:00 Node A - UPDATE (1,1) -> (1,11)
> > > >
> > > > - @9:02 Node C - DELETE (1,1)
> > > >
> > > > Assume a slight delay at Node B before it applies the update from Node A.
> > > >
> > > > @9:03 Node C - advances the non-removable XID because it sees no
> > > > concurrent transactions from Node B. It is unaware of Node A’s
> > > > concurrent
> > > update.
> > > >
> > > > @9:04 Node B - receives Node A's UPDATE and applies (1,1) -> (1,11)
> > > > t1 has tuples : (1,11), (2,2)
> > > >
> > > > @9:05 Node C - receives the UPDATE (1,1) -> (1,11)
> > > > - As conflict slot’s xmin is advanced, the deleted tuple may
> > > > already
> > > have
> > > > been removed.
> > > > - Conflict resolution fails to detect update_deleted and instead raises
> > > > update_missing.
> > > >
> > > > Note that, as per decoding logic Node C sees the commit timestamp of
> > > > the update as 9:00 (origin commit_ts from Node A), not 9:04 (commit
> > > > time on Node B). In this case, since the UPDATE's timestamp is
> > > > earlier than the DELETE, Node C should ideally detect an
> > > > update_deleted conflict. However, it cannot, because it no longer retains
> > the deleted tuple.
> > > >
> > > > Even if Node C attempts to retrieve the latest WAL position from
> > > > Node A, Node C doesn't maintain any LSN which we could use to compare
> > with it.
> > > >
> > > > This scenario is similar to another restriction in the patch where
> > > > retain_conflict_info is not supported if the publisher is also a
> > > > physical standby, as the required transaction information from the
> > > > original primary is unavailable. Moreover, this limitation is
> > > > relevant only when the subscription origin option is set to ANY, as
> > > > only in that case changes from other origins can be replicated.
> > > > Since retain_conflict_info is primarily useful for conflict
> > > > detection in bidirectional clusters where the origin option is set
> > > > to NONE, this limitation
> > > appears acceptable.
> > > >
> > > > Given these findings, to help users avoid unintended configurations,
> > > > we plan to issue a warning in scenarios where replicated changes may
> > > > include origins other than the direct publisher, similar to the
> > > > existing checks in the
> > > > check_publications_origin() function.
> > > >
> > > > Here is the latest patch that implements the warning and documents
> > > > this case. Only 0001 is modified for this.
> > > >
> > > > A big thanks to Nisha for invaluable assistance in identifying this
> > > > case and preparing the analysis for it.
> > >
> > > I'm still reviewing the 0001 patch but let me share some comments and
> > > questions I have so far:
> >
> > Thanks for the comments!
> >
> > >
> > > ---
> > > +/*
> > > + * Determine the minimum non-removable transaction ID across all
> > > +apply workers
> > > + * for subscriptions that have retain_conflict_info enabled. Store
> > > +the result
> > > + * in *xmin.
> > > + *
> > > + * If the replication slot cannot be advanced during this cycle, due
> > > +to either
> > > + * a disabled subscription or an inactive worker, set
> > > +*can_advance_xmin to
> > > + * false.
> > > + */
> > > +static void
> > > +compute_min_nonremovable_xid(LogicalRepWorker *worker,
> > > + bool retain_conflict_info, TransactionId
> > *xmin,
> > > + bool *can_advance_xmin)
> > >
> > > I think this function is quite confusing for several reasons. For
> > > instance, it's doing more things than described in the comments such
> > > as trying to create the CONFLICT_DETECTION_SLOT if no worker is
> > > passed. Also, one of the caller
> > > describes:
> > >
> > > + /*
> > > + * This is required to ensure that we don't advance the xmin
> > > + * of CONFLICT_DETECTION_SLOT even if one of the
> > > subscriptions
> > > + * is not enabled. Otherwise, we won't be able to detect
> > > + * conflicts reliably for such a subscription even though it
> > > + * has set the retain_conflict_info option.
> > > + */
> > > + compute_min_nonremovable_xid(NULL,
> > > sub->retainconflictinfo,
> > > + &xmin,
> > > + &can_advance_xmin);
> > >
> > > but it's unclear to me from the function name that it tries to create
> > > the replication slot. Furthermore, in this path it doesn't actually
> > > compute xmin. I guess we can try to create CONFLICT_DETECTION_SLOT in
> > > the loop of "foreach(lc, sublist)" and set false to can_advance_xmin
> > > if either the subscription is disabled or the worker is not running.
> >
> > I understand. The original code was similar to your suggestion, but we decided
> > to encapsulate it within a separate function to maintain a clean and concise
> > main loop. However, your suggestion also makes sense, so I will proceed with
> > the change.
>
> I have made this change in the 0002 patch for reference. What do you think ? If
> there are no objections, I plan to merge it in the next version.
>
The changes in the 0002 patch look good to me. I've attached the patch
for some minor suggestions. Please incorporate these changes if you
agree.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachment | Content-Type | Size |
---|---|---|
change_masahiko_v49.patch | application/octet-stream | 3.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2025-07-18 21:30:43 | Re: Conflict detection for update_deleted in logical replication |
Previous Message | Andres Freund | 2025-07-18 21:03:59 | Re: Adding basic NUMA awareness |