Re: Conflict detection for update_deleted in logical replication

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2025-08-19 19:23:44
Message-ID: CAD21AoAkZ8WJ19Ls3pfPx78DY_dRXPf2Ad0W=CkfBZkVYerHaA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Aug 17, 2025 at 10:06 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Sat, Aug 16, 2025 at 5:15 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > Regarding the subscription-level option vs. GUC, I don't disagree with
> > the current approach.
> >
> > For the record, while I agree that the subscription-level option is
> > more consistent with the existing retain_dead_tuples option and can
> > work for different requirements, my biggest concern is that if users
> > set different values to different subscriptions, they might think it
> > doesn't work as expected. This is because even if a subscription
> > disables retaining dead tuples due to max_conflict_retention_duration,
> > the database cluster doesn't stop retaining dead tuples unless all
> > other subscriptions disable it, meaning that the performance on that
> > database might not get recovered. I proposed the GUC parameter as I
> > thought it's less confusing from a user perspective. I'm not sure it's
> > sufficient to mention that in the documentation but I don't have a
> > better idea.
> >
>
> I think we might want to gave a GUC as well in the future as both
> (subscription option and GUC) can be used in different scenarios but
> better to wait for some user feedback before going that far. We can
> document this in the option to make users aware how to use it in such
> situations.

Okay.

>
> >
> > ---
> > +static void
> > +notify_ineffective_max_conflict_retention(bool update_maxconflretention)
> > +{
> > + ereport(NOTICE,
> > + errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > + update_maxconflretention
> > + ? errmsg("max_conflict_retention_duration has no effect
> > when retain_dead_tuples is disabled")
> > + : errmsg("disabling retain_dead_tuples will render
> > max_conflict_retention_duration ineffective"));
> > +}
> >
> > Given that max_conflict_retention_duration works only when
> > retain_dead_tuples is enabled, why not merge these two parameters? For
> > example, setting max_conflict_retention_duration=-1 means to disable
> > retaining dead tuples behavior and =0 means that dead tuples are
> > retained until they are no longer needed for detection purposes.
> >
>
> I think it can be a source of confusion for users, if not now, then in
> the future. Consider that in the future, we also have a GUC to set the
> retention duration which will be used for all subscriptions. Now, say,
> we define the behaviour such that if this value is set for
> subscription, then use that, otherwise, use the GUC value. Then, with
> this proposal, if the user sets max_conflict_retention_duration=0, it
> will lead to retaining tuples until they are no longer needed but with
> the behaviour proposed in patch, one could have simply set
> retain_dead_tuples=true and used the GUC value. I understand that it
> is debatable how we will design the GUC behaviour in future but I used
> it as an example how trying to achieve different things with one
> option can be a source of confusion. Even if we decide not to
> introduce GUC or define its behaviour differently, I find having
> different options in this case is easy to understand and use.

Agreed.

>
> > ---
> > + /*
> > + * Only the leader apply worker manages conflict retention (see
> > + * maybe_advance_nonremovable_xid() for details).
> > + */
> > + if (!isParallelApplyWorker(&worker) && !isTablesyncWorker(&worker))
> > + values[10] = TransactionIdIsValid(worker.oldest_nonremovable_xid);
> > + else
> > + nulls[10] = true;
> >
> > I think that adding a new column to the pg_stat_subscription view
> > should be implemented in a separate patch since it needs to bump the
> > catalog version while introducing max_conflict_retention_duration
> > subscription option doesn't require that.
> >
>
> Won't following change in pg_subsciption.h anyway requires us to bump
> catversion?

Opps, you're right. I missed that part.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-08-19 19:50:19 Re: VM corruption on standby
Previous Message Nathan Bossart 2025-08-19 19:18:40 Re: pg_upgrade: transfer pg_largeobject_metadata's files when possible