| From: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
|---|---|
| To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
| Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> |
| Subject: | Re: Conflict detection for update_deleted in logical replication |
| Date: | 2025-08-18 06:32:15 |
| Message-ID: | CAFiTN-v8GXcxaQm1H_cdfnx35rRsYTW41Ms8wo4t9XyOg_7EPQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Mon, Aug 18, 2025 at 10:36 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> > Given that max_conflict_retention_duration works only when
> > retain_dead_tuples is enabled, why not merge these two parameters? For
> > example, setting max_conflict_retention_duration=-1 means to disable
> > retaining dead tuples behavior and =0 means that dead tuples are
> > retained until they are no longer needed for detection purposes.
> >
>
> I think it can be a source of confusion for users, if not now, then in
> the future. Consider that in the future, we also have a GUC to set the
> retention duration which will be used for all subscriptions. Now, say,
> we define the behaviour such that if this value is set for
> subscription, then use that, otherwise, use the GUC value. Then, with
> this proposal, if the user sets max_conflict_retention_duration=0, it
> will lead to retaining tuples until they are no longer needed but with
> the behaviour proposed in patch, one could have simply set
> retain_dead_tuples=true and used the GUC value. I understand that it
> is debatable how we will design the GUC behaviour in future but I used
> it as an example how trying to achieve different things with one
> option can be a source of confusion. Even if we decide not to
> introduce GUC or define its behaviour differently, I find having
> different options in this case is easy to understand and use.
I agree that merging these 2 will create confusion and usability issues.
>
> > ---
> > + /*
> > + * Only the leader apply worker manages conflict retention (see
> > + * maybe_advance_nonremovable_xid() for details).
> > + */
> > + if (!isParallelApplyWorker(&worker) && !isTablesyncWorker(&worker))
> > + values[10] = TransactionIdIsValid(worker.oldest_nonremovable_xid);
> > + else
> > + nulls[10] = true;
> >
> > I think that adding a new column to the pg_stat_subscription view
> > should be implemented in a separate patch since it needs to bump the
> > catalog version while introducing max_conflict_retention_duration
> > subscription option doesn't require that.
> >
>
> Won't following change in pg_subsciption.h anyway requires us to bump
> catversion?
> @@ -81,6 +81,10 @@
> CATALOG(pg_subscription,6100,SubscriptionRelationId)
> BKI_SHARED_RELATION BKI_ROW
> bool subretaindeadtuples; /* True if dead tuples useful for
> * conflict detection are retained */
>
> + int32 maxconflretention; /* The maximum duration (in milliseconds)
> + * for which information useful for
> + * conflict detection can be retained */
> +
>
> > ---
> > Even if an apply worker disables retaining dead tuples due to
> > max_conflict_retention_duration, it enables again after the server
> > restarts.
> >
>
> I also find this behaviour questionable because this also means that
> it is possible that before restart one would deduce that the
> update_deleted conflict won't be reliably detected for a particular
> subscription but after restart it could lead to the opposite
> conclusion. But note that to make it behave similarly we need to store
> this value persistently in pg_subscription unless you have better
> ideas for this. Theoretically, there are two places where we can
> persist this information, one is with pg_subscription, and other in
> origin. I find it is closer to pg_subscription.
I think it makes sense to store this in pg_subscription to preserve
the decision across restart.
--
Regards,
Dilip Kumar
Google
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Chao Li | 2025-08-18 06:35:42 | Re: GB18030-2022 Support in PostgreSQL |
| Previous Message | Hayato Kuroda (Fujitsu) | 2025-08-18 06:30:18 | RE: memory leak in logical WAL sender with pgoutput's cachectx |