RE: Avoid retaining conflict-related data when no tables are subscribed

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: RE: Avoid retaining conflict-related data when no tables are subscribed
Date: 2025-09-02 05:21:11
Message-ID: TY4PR01MB16907424D996A8930F6D293AF9406A@TY4PR01MB16907.jpnprd01.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Friday, August 29, 2025 12:05 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Aug 28, 2025 at 7:54 AM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > Hi,
> >
> > My colleague Nisha reported an issue to me off-list: dead tuples can't
> > be removed when retain_dead_tuples is enabled for a subscription with no
> tables.
> >
> > This appears to stem from the inability to advance the non-removable
> > transaction ID when AllTablesyncsReady() returns false. Since this
> > function returns false when no tables are present, which leads to
> > unnecessary data retention until a table is added to the subscription.
> >
> > Since dead tuples don't need to be retained when no tables are
> > subscribed, here is a patch to fix it, modifying AllTablesyncsReady()
> > to allows no tables to be treated as a ready state when explicitly requested.
> >
>
> Few comments:
> ============
> Aren't following two paragraphs in comments contradict each other:
>
>    * It is safe to add new tables with initial states to the subscription
>    * after this check because any changes applied to these tables should
>    * have a WAL position greater than the rdt_data->remote_lsn.
> +   *
> +   * Advancing the transaction ID is also necessary when no tables are
> +   * subscribed, as it prevents unnecessary retention of dead tuples.
> +Although
> +   * it seem feasible to skip all phases and directly assign
> +candidate_xid to
> +   * oldest_nonremovable_xid in the RDT_GET_CANDIDATE_XID phase
> when no
> +tables
> +   * are currently subscribed, this approach is unsafe. This is because
> +new
> +   * tables may be added to the subscription after the initial table
> +check,
> +   * requiring tuples deleted before candidate_xid for conflict
> +detection in
> +   * upcoming transactions. Therefore, it remains necessary to wait for
> +all
> +   * concurrent transactions to be fully applied.
>    */
>
> In the first para, the comments say that it is okay to add tables after this check
> and in the second para, it says that is not okay?

I have removed the 1st para because it's inaccurate.

>
> 2.
> + * If the subscription has no tables, return the value determined by
> + * 'ready_if_no_tables'.
> + *
> + * Otherwise, return whether all the tables for the subscription are in
> + the
> + * READY state.
> *
> * Note: This function is not suitable to be called from outside of apply or
> * tablesync workers because MySubscription needs to be already initialized.
> */
> bool
> -AllTablesyncsReady(void)
> +AllTablesyncsReady(bool ready_if_no_tables)
>
> This change serves the purpose but I find it makes the API complex to
> understand because now it needs to make decisions based on different states
> depending on the boolean parameter passed. Can we introduce a new API for
> the empty subscription case?

Added a new function HasSubscriptionRelationsCached() as suggested.

Attached is the V2 patch which addresses the above comments and
fixes a typo.

Apart from the original reported issue, we noticed another point that can be
improved during the implementation of update_deleted patches: When a disabled
subscription is created with retain_dead_tuples set to true, the launcher is not
woken up immediately, which may lead to delays in creating the conflict
detection slot and cause user confusion. So, I prepared the 0002 patch to fix
this issue.

Best Regards,
Hou zj

Attachment Content-Type Size
v2-0001-Avoid-retaining-conflict-related-data-when-no-tab.patch application/octet-stream 3.7 KB
v2-0002-Wakeup-launcher-on-subscription-creation-to-creat.patch application/octet-stream 1.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Chao Li 2025-09-02 05:27:29 Re: SQL:2023 JSON simplified accessor support
Previous Message shveta malik 2025-09-02 04:55:40 Re: POC: enable logical decoding when wal_level = 'replica' without a server restart