Re: Conflict detection for update_deleted in logical replication

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2025-08-08 06:34:00
Message-ID: CAJpy0uBXU_P6To-mFc01UyJQqqOEGhUXNuLVOh1LNxOeMqeY7g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 7, 2025 at 10:10 AM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Tuesday, August 5, 2025 10:09 AM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > Here is V57 patch set which addressed most of comments.
> >
> > In this version, I also fixed a bug that the apply worker continued to find dead
> > tuples even if it has already stop retaining dead tuples.
>
> Here is a V58 patch set which improved few things by internal review:
>
> 0001:
>

Thank You for the patches, please find a few comments on 001 alone:

1)
+ /*
+ * Return if the wait time has not exceeded the maximum limit
+ * (max_conflict_retention_duration).
+ */
+ if (!TimestampDifferenceExceeds(rdt_data->candidate_xid_time, now,
+ max_conflict_retention_duration +
+ rdt_data->table_sync_wait_time))

We can add comments here as in why we are adding table-sync time to
max_conflict_retention_duration.

2)
relmutex comment says:

/* Used for initial table synchronization. */
Oid relid;
char relstate;
XLogRecPtr relstate_lsn;
slock_t relmutex;

We shall update this comment as now we are using it for other
purposes. Also name is specific to relation (due to originally created
for table-sync case). We can rename it to be more general so that it
can be used for oldest-xid access purposes as well.

3)
+ Assert(TransactionIdIsValid(rdt_data->candidate_xid));
+ Assert(rdt_data->phase == RDT_WAIT_FOR_PUBLISHER_STATUS ||
+ rdt_data->phase == RDT_WAIT_FOR_LOCAL_FLUSH);
+
+ if (!max_conflict_retention_duration)
+ return false;

Shall we move 'max_conflict_retention_duration' NULL check as the
first step. Or do you think it will be better to move it to the caller
before should_stop_conflict_info_retention is invoked?

4)
+ The information useful for conflict detection is no longer retained if
+ all apply workers associated with the subscriptions, where
+ <literal>retain_dead_tuples</literal> is enabled, confirm that the
+ retention duration exceeded the
+ <literal>max_conflict_retention_duration</literal>. To re-enable
+ retention, you can disable <literal>retain_dead_tuples</literal> and
+ re-enable it after confirming this replication slot has been dropped.

But the replication slot will not be dropped unless all the
subscriptions have disabled retain_dead_tuples. So shall the above doc
somehow mention this part as well otherwise it could be misleading for
users.

5)
pg_stat_subscription_stats: retain_dead_tuples

Can it cause confusion as both subscription's parameter and
pg_stat_subscription_stats's column have the same name while may have
different values. Shall the stats one be named as
'effective_retain_dead_tuples'?

thanks
Shveta

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jelte Fennema-Nio 2025-08-08 06:44:30 Re: BackendKeyData is mandatory?
Previous Message Fujii Masao 2025-08-08 05:40:02 Re: Bug in pg_dump --filter? - Invalid object types can be misinterpreted as valid