RE: Conflict detection for update_deleted in logical replication

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: RE: Conflict detection for update_deleted in logical replication
Date: 2025-09-12 10:09:06
Message-ID: TY4PR01MB16907F44AF862041E97432BA49408A@TY4PR01MB16907.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Friday, September 12, 2025 4:48 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> On Fri, Sep 12, 2025 at 8:55 AM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com>
> wrote:
> >
> >
> > I agree. Here is a V73 patch that will restart the worker if the
> > retention resumes. I also addressed other comments posted by Amit[1].
> >
>
> Thanks for the patch. A few comments:

Thanks for the comments!

>
> 1)
> There is a small window where the worker can exit while resuming retention
> and the launcher can end up accessing stale worker info.
>
> Let's say the launcher is at a stage where it has fetched the worker:
> w = logicalrep_worker_find(sub->oid, InvalidOid, false);
>
> And after this point, before the launcher could do
> compute_min_nonremovable_xid(), the worker has stopped retention and
> resumed as well. Now the worker has exited but in
> compute_min_nonremovable_xid(), launcher will still use the worker-info
> fetched previously.

Thanks for catching this. I have fixed it by computing the xid while
holding LogicalRepWorkerLock.
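
To illustrate the idea only (this is not the code in the patch), below is a
minimal standalone sketch in which the lookup and the xid read happen inside
one critical section, so the worker cannot exit in between. Everything in it
is a hypothetical stand-in: a pthread mutex plays the role of
LogicalRepWorkerLock, and WorkerSlot, find_worker_locked() and
launcher_min_xid() are made-up names, not PostgreSQL definitions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's types. */
typedef uint32_t Xid;
#define INVALID_XID ((Xid) 0)
#define MAX_WORKERS 4

typedef struct WorkerSlot
{
    bool     in_use;
    uint32_t subid;
    Xid      oldest_nonremovable_xid;
} WorkerSlot;

static WorkerSlot workers[MAX_WORKERS];
/* Plays the role of LogicalRepWorkerLock in this model. */
static pthread_mutex_t worker_lock = PTHREAD_MUTEX_INITIALIZER;

/* Look up the slot for a subscription. Caller must hold worker_lock. */
static WorkerSlot *
find_worker_locked(uint32_t subid)
{
    for (size_t i = 0; i < MAX_WORKERS; i++)
        if (workers[i].in_use && workers[i].subid == subid)
            return &workers[i];
    return NULL;
}

/* Worker side: claim a free slot and publish its xid. */
static bool
worker_attach(uint32_t subid, Xid xid)
{
    bool attached = false;

    assert(xid != INVALID_XID);
    pthread_mutex_lock(&worker_lock);
    for (size_t i = 0; i < MAX_WORKERS; i++)
    {
        if (!workers[i].in_use)
        {
            workers[i].in_use = true;
            workers[i].subid = subid;
            workers[i].oldest_nonremovable_xid = xid;
            attached = true;
            break;
        }
    }
    pthread_mutex_unlock(&worker_lock);
    return attached;
}

/* Worker side: release the slot on exit. */
static void
worker_exit(uint32_t subid)
{
    pthread_mutex_lock(&worker_lock);
    WorkerSlot *w = find_worker_locked(subid);
    if (w != NULL)
    {
        w->in_use = false;
        w->oldest_nonremovable_xid = INVALID_XID;
    }
    pthread_mutex_unlock(&worker_lock);
}

/*
 * Launcher side: find the worker and read its xid under the same lock,
 * so a stale slot pointer is never dereferenced after the worker exits.
 * Plain '<' is a simplification; real xid comparison handles wraparound.
 */
static Xid
launcher_min_xid(uint32_t subid, Xid current_min)
{
    Xid result = current_min;

    pthread_mutex_lock(&worker_lock);
    WorkerSlot *w = find_worker_locked(subid);
    if (w != NULL && w->oldest_nonremovable_xid != INVALID_XID)
    {
        if (result == INVALID_XID || w->oldest_nonremovable_xid < result)
            result = w->oldest_nonremovable_xid;
    }
    pthread_mutex_unlock(&worker_lock);
    return result;
}
```

In this model, a worker that exits before the launcher takes the lock is
simply not found, and one that exits afterwards has to wait until the
launcher has finished reading.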

>
> 2)
>
>   if (should_stop_conflict_info_retention(rdt_data))
> +  {
> +    /*
> +     * Stop retention if not yet. Otherwise, reset to the initial phase to
> +     * retry all phases. This is required to recalculate the current wait
> +     * time and resume retention if the time falls within
> +     * max_retention_duration.
> +     */
> +    if (MySubscription->retentionactive)
> +      rdt_data->phase = RDT_STOP_CONFLICT_INFO_RETENTION;
> +    else
> +      reset_retention_data_fields(rdt_data);
> +
>     return;
> +  }
>
>
>
> Shall we have an Assert(!MyLogicalRepWorker->oldest_nonremovable_xid)
> in the 'else' part above?

Added.
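
For readers following along, the shape of that branch with the assertion can
be modelled roughly as below. This is a simplified standalone sketch, not the
patch itself: the types, the phase names and handle_retention_timeout() are
hypothetical, and the standard assert() stands in for PostgreSQL's Assert().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's types. */
typedef uint32_t Xid;
#define INVALID_XID ((Xid) 0)

typedef enum
{
    PHASE_INITIAL,          /* first phase of the retry cycle */
    PHASE_STOP_RETENTION    /* retention is about to be stopped */
} Phase;

typedef struct
{
    Phase phase;
    Xid   candidate_xid;
} RetentionData;

typedef struct
{
    bool retention_active;
    Xid  oldest_nonremovable_xid;
} WorkerState;

/* Return to the initial phase so that all phases are retried. */
static void
reset_retention_fields(RetentionData *data)
{
    data->phase = PHASE_INITIAL;
    data->candidate_xid = INVALID_XID;
}

/* Called when the wait time has exceeded the configured maximum. */
static void
handle_retention_timeout(RetentionData *data, const WorkerState *worker)
{
    if (worker->retention_active)
        data->phase = PHASE_STOP_RETENTION;
    else
    {
        /* Retention already stopped, so no xid should be held back. */
        assert(worker->oldest_nonremovable_xid == INVALID_XID);
        reset_retention_fields(data);
    }
}
```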

Here is the V74 patch, which addresses all of the comments above.

Best Regards,
Hou zj

Attachment: v74-0001-Allow-conflict-relevant-data-retention-to-resume.patch (application/octet-stream, 18.5 KB)
