Re: Fix pg_upgrade to detect invalid logical replication slots on PG19

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Lakshmi N <lakshmin(dot)jhs(at)gmail(dot)com>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Fix pg_upgrade to detect invalid logical replication slots on PG19
Date: 2026-04-22 04:27:10
Message-ID: CAJpy0uD3Tre0jEBKZoVsxgdHQGTVTXW5hGmiOxB7TG=eWAABGw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 20, 2026 at 2:28 PM Lakshmi N <lakshmin(dot)jhs(at)gmail(dot)com> wrote:
>
> Hi Hackers,
>
> The PG19-optimized slot catchup query uses a CTE that filters on
> invalidation_reason IS NULL, then cross-joins it with the main slot
> query. When ALL logical slots in a database are invalid, the CTE
> returns zero rows, and the cross join produces an empty result set.
> This causes pg_upgrade to silently skip those slots entirely --
> neither detecting them as invalid (which should block the upgrade)
> nor attempting to migrate them.
>
> The pre-PG19 query path does not have this problem because it queries
> pg_replication_slots directly without a cross join. This may not impact
> upgrade to PG19 but will change the behavior for PG20 upgrade.
>
> Fix by changing the cross join to a LEFT JOIN,
> so that invalid slots still appear in the result set with NULL
> caught_up values.
>

I agree with the problem here.

Another way to solve this would be to use a scalar subquery (see [1]),
but that would reduce readability. Thus, I prefer a LEFT OUTER JOIN ...
ON TRUE here. There should also be no performance impact, since the
right-hand side query always returns at most one row due to the
LIMIT 1 clause. So IMO, the proposed solution is good. Copying
Sawada-san, as he was the author of the original patch.
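
To make the difference between the two joins concrete, here is a minimal sketch using SQLite through Python's sqlite3 module rather than PostgreSQL. The `slots` table, its columns, and the constant 150 standing in for `last_pending_wal` are all assumptions made for illustration; this is not the actual pg_upgrade query. It only shows that a cross join against an empty one-row CTE drops every row, while a left join on an always-true condition keeps them with NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for pg_replication_slots:
# every slot here is invalid (invalidation_reason is set).
conn.executescript("""
    CREATE TABLE slots (
        slot_name TEXT,
        confirmed_flush_lsn INTEGER,
        invalidation_reason TEXT
    );
    INSERT INTO slots VALUES ('s1', 100, 'wal_removed'),
                             ('s2', 200, 'wal_removed');
""")

# Assumed shape of the CTE: at most one row, and only when at least
# one valid slot exists. The value 150 is an arbitrary placeholder.
CTE = """
WITH check_caught_up AS (
    SELECT 150 AS last_pending_wal
    FROM slots
    WHERE invalidation_reason IS NULL
    LIMIT 1
)
"""

# CROSS JOIN: an empty right-hand side yields an empty result.
cross_rows = conn.execute(CTE + """
    SELECT s.slot_name, c.last_pending_wal
    FROM slots s CROSS JOIN check_caught_up c
    ORDER BY s.slot_name
""").fetchall()

# LEFT JOIN on an always-true condition: every slot row survives,
# with NULL for the columns coming from the empty CTE.
left_rows = conn.execute(CTE + """
    SELECT s.slot_name, c.last_pending_wal
    FROM slots s LEFT JOIN check_caught_up c ON 1 = 1
    ORDER BY s.slot_name
""").fetchall()

print(cross_rows)
print(left_rows)
conn.close()
```

Because the CTE is capped at one row, the left join never multiplies the slot rows; it either attaches that single row to each slot or pads with NULL.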

[1]:
SELECT slot_name, plugin, two_phase, failover,
       CASE
           WHEN invalidation_reason IS NOT NULL THEN FALSE
           ELSE (
               (SELECT last_pending_wal FROM check_caught_up) IS NULL
               OR confirmed_flush_lsn > (SELECT last_pending_wal FROM check_caught_up)
           )
       END AS caught_up,
       invalidation_reason IS NOT NULL AS invalid
FROM pg_catalog.pg_replication_slots
WHERE slot_type = 'logical'
  AND database = current_database()
  AND temporary IS FALSE;
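
As a rough check of how the scalar-subquery form behaves, the sketch below runs a cut-down version of it against SQLite via Python's sqlite3 module. The `slots` table, the sample rows, and the constant 150 used for `last_pending_wal` are assumptions for illustration only, and SQLite reports booleans as 0/1. The point is that a scalar subquery over an empty CTE evaluates to NULL instead of removing rows, so all-invalid databases still produce one row per slot.

```python
import sqlite3

# Cut-down version of the scalar-subquery query; the CTE body and the
# placeholder value 150 are assumptions, not the pg_upgrade definition.
QUERY = """
WITH check_caught_up AS (
    SELECT 150 AS last_pending_wal
    FROM slots
    WHERE invalidation_reason IS NULL
    LIMIT 1
)
SELECT slot_name,
       CASE
           WHEN invalidation_reason IS NOT NULL THEN 0
           ELSE ((SELECT last_pending_wal FROM check_caught_up) IS NULL
                 OR confirmed_flush_lsn >
                    (SELECT last_pending_wal FROM check_caught_up))
       END AS caught_up,
       invalidation_reason IS NOT NULL AS invalid
FROM slots
ORDER BY slot_name
"""


def run(slots):
    """Load the given (name, lsn, invalidation_reason) rows and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE slots (slot_name TEXT, confirmed_flush_lsn INTEGER, "
        "invalidation_reason TEXT)"
    )
    conn.executemany("INSERT INTO slots VALUES (?, ?, ?)", slots)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


# All slots invalid: the CTE is empty, yet both rows are still returned.
all_invalid = run([("s1", 100, "wal_removed"), ("s2", 200, "wal_removed")])

# Mixed case: s2 is past the placeholder LSN, s3 is not.
mixed = run([("s1", 100, "wal_removed"), ("s2", 200, None), ("s3", 120, None)])

print(all_invalid)
print(mixed)
```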

Thanks,
Shveta
