Re: Issue with logical replication slot during switchover

From: Alexander Kukushkin <cyberdemn(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
Subject: Re: Issue with logical replication slot during switchover
Date: 2025-11-12 07:28:15
Message-ID: CAFh8B=nwHPrBRe7_wECYmu-bFbDoGy+LdPxAVc-V+9Xa+USrYg@mail.gmail.com
Lists: pgsql-hackers

Hi Amit,

On Wed, 12 Nov 2025 at 05:22, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> It is difficult to tell when this can happen but you detailed there is
> a theoretical possibility of the same. If we had an in-core cluster
> tool that manages nodes on its own which doesn't allow such scenarios
> to happen then we could possibly say that using such a tool it is safe
> to overwrite old primary's slots.

That's a lot of ifs, and none of them could be fulfilled in the foreseeable
future.

The situation you describe is impossible.
When there is a split-brain and someone drops and re-creates logical slots
with the same names on the old primary, such a node can't be joined as a
standby without pg_rewind.
In its current state pg_rewind wipes the pg_replslot directory, and
therefore there will be no replication slots.
That is, if there is a logical replication slot with failover=true and
synced=false on a healthy standby, it could have happened only because the
old primary was shut down gracefully.
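
To illustrate the condition above, here is a minimal sketch of how such slots could be picked out on a standby. It assumes the failover and synced columns of the pg_replication_slots view (PostgreSQL 17 and later); the helper name leftover_failover_slots and the row format (a list of dicts) are hypothetical and only for illustration, and running the query itself is left to whatever client is in use.

```python
# Query one might run on the standby; the columns are assumed to match the
# pg_replication_slots view in PostgreSQL 17+.
SLOT_QUERY = """
SELECT slot_name, slot_type, failover, synced
FROM pg_replication_slots
WHERE slot_type = 'logical'
"""


def leftover_failover_slots(rows, in_recovery):
    """Return names of logical slots with failover=true and synced=false.

    rows: iterable of dicts shaped like the result of SLOT_QUERY (hypothetical
          format chosen for this sketch).
    in_recovery: result of pg_is_in_recovery() on the node; the condition
          discussed here only concerns standbys.
    """
    if not in_recovery:
        return []
    return [
        row["slot_name"]
        for row in rows
        if row["slot_type"] == "logical" and row["failover"] and not row["synced"]
    ]
```

Per the argument above, a non-empty result on a healthy standby would point to slots carried over from a gracefully shut down old primary, since a node that went through pg_rewind would have no slots at all.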

Regards,
--
Alexander Kukushkin
