Re: Add an option to skip loading missing publication to avoid logical replication failure

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add an option to skip loading missing publication to avoid logical replication failure
Date: 2025-03-13 05:19:00
Message-ID: CALDaNm2gfsxQL7VrurNnJffCBcuyrZQ1LXDOhf9FY57n8=8cVw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 13 Mar 2025 at 09:18, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> Thanks looks good to me.
>
> While looking at the patch, I have a few comments/questions
>
> + if (pub)
> + result = lappend(result, pub);
> + else
> + {
> + /*
> + * When executing 'ALTER SUBSCRIPTION ... SET PUBLICATION', the
> + * apply worker continues using the existing replication slot and
> + * origin after restarting. If the replication origin is not
> + * updated before the restart, the WAL start location may point to
> + * a position before the specified publication exists, causing
> + * persistent apply worker restarts and errors.
> + *
> + * This ensures that the publication is skipped if it does not
> + * exist and is loaded when the corresponding WAL record is
> + * encountered.
> + */
> + ereport(WARNING,
> + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("skipped loading publication: %s", pubname),
> + errhint("If the publication already exists, ignore it as it will be
> loaded upon reaching the corresponding WAL record; otherwise, create
> it."));
> + }
>
> This comment focuses on a specific use case regarding the problem with
> 'ALTER SUBSCRIPTION ... SET PUBLICATION,' but in reality, we are
> addressing a more general case where the user is trying to SET
> PUBLICATION or even CREATE SUBSCRIPTION, and some publications are
> missing. Wouldn't it be better to rephrase the comment?

How about a comment something like below:
/*
* In 'ALTER SUBSCRIPTION ... ADD/SET PUBLICATION' and
* 'CREATE SUBSCRIPTION', if the replication origin is not updated
* before the worker exits, the WAL start location might point to a
* position before the publication's WAL record. This can lead to
* persistent apply worker restarts and errors.
*
* Additionally, dropping a subscription's publication should not
* disrupt logical replication.
*
* This ensures that a missing publication is skipped and loaded
* when its corresponding WAL record is encountered.
*/
ereport(WARNING,
errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("skipped loading publication: %s", pubname),
errhint("If the publication is missing, create and refresh it.
Otherwise, wait for the slot to reach the WAL record, then refresh"));

> 2. + errhint("If the publication already exists, ignore it as it will
> be loaded upon reaching the corresponding WAL record; otherwise,
> create it."));
>
> Is this hint correct? This is a question rather than a comment: When
> we reach a particular WAL where the publication was created, will the
> publication automatically load, or does the user need to REFRESH the
> publications?

Users need to refresh the publication in case the relation is not
already added to pg_subscription_rel and apply incremental changes.
How about an error hint like:
"If the publication is missing, create and refresh it. Otherwise, wait
for the slot to reach the WAL record for the created publication, then
refresh"

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2025-03-13 05:22:41 Re: Commitfest Manager for March
Previous Message David Rowley 2025-03-13 04:53:42 Re: [PoC] Reducing planning time when tables have many partitions