Re: Conflict detection for update_deleted in logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2025-05-16 05:45:01
Message-ID: CAA4eK1LUMCLAnrGjbBYz+LvsgoLJBrzYQrp63E0x_ZTwEKvHEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 25, 2025 at 4:06 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> 2)
> in wait_for_local_flush(), we have
> should_stop_conflict_info_retention() before 'AllTablesyncsReady'
> check. Should we give a discount for table-sync time and avoid doing
> stop-conflict-retention when table-sync is going on? This is because
> table-sync is one time operation (or done only on
> subscription-refresh), so we shall not count time spent in table-sync
> for 'max_conflict_retention_duration'. We can reset our timer if
> table-sync is observed to be going on. Thoughts?
>

Sounds reasonable to me.

>
> 3)
> In get_candidate_xid(), we first set candidate_xid_time and later
> candidate_xid. And between these 2 there are chances that we return
> without updating candidate_xid. See 'Return if the
> oldest_nonremovable_xid cannot be advanced ' comment. That will leave
> 'candidate_xid_time' set to new value while 'candidate_xid' is not
> yet set.
>

Good point. I think we should set 'candidate_xid_time' along with
candidate_xid (just after setting candidate_xid).

> 4)
> Do you think there should be some relation between
> 'xid_advance_interval' and 'max_conflict_retention_duration'? Should
> max of 'xid_advance_interval' be limited by
> 'max_conflict_retention_duration'. Currently say
> xid_advance_interval' is set to max 3 mins, what if
> 'max_conflict_retention_duration' is set to 2 mins? In that case we
> will not even check for new xids before 3 mins are over, while
> 'max_conflict_retention_duration' sets a limit of 2 mins for dead
> tuples retention.
>

Right, ideally, the 'xid_advance_interval' should be set to a value
less than 'max_conflict_retention_duration' when no new_xid is found.

BTW, another related point is that when we decide to stop retaining
dead tuples (via should_stop_conflict_info_retention), should we also
consider the case that the apply worker didn't even try to get the
publisher status because previously it decided that
oldest_nonremovable_xid cannot be advanced due to its
OldestActiveTransactionId?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2025-05-16 05:45:20 Re: Conflict detection for update_deleted in logical replication
Previous Message Daniil Davydov 2025-05-16 05:10:10 Re: POC: Parallel processing of indexes in autovacuum