Re: Issue with logical replication slot during switchover

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alexander Kukushkin <cyberdemn(at)gmail(dot)com>
Cc: Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
Subject: Re: Issue with logical replication slot during switchover
Date: 2025-11-10 12:10:14
Message-ID: CAA4eK1+Exdc27D2SByoO+iaSjacgfoAYgWDHXW=pYCq49CBFGA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 31, 2025 at 2:58 PM Alexander Kukushkin <cyberdemn(at)gmail(dot)com> wrote:
>
> Instead of dropping such slots, what we actually need is a way to safely set synced=false->true and continue operating.
>
> Operating logical replication setups is already extremely complex and error-prone — this is not theoretical, it’s something many of us face daily.
> So rather than adding more speculative features or workarounds, I think we should focus on addressing real operational pain points and the inconsistencies in the current design.
>
> A slot created on the primary (which later becomes a standby) with failover=true has a very clear purpose. The failover flag already indicates that purpose; synced shouldn’t override it.
>

I don't think this is as clear-cut as you suggest when compared with
WAL. In failover cases, we bump the WAL timeline on the new primary,
and we have facilities like pg_rewind to ensure that the old primary
can follow the new primary after divergence. For slots, there is no
such facility. One could argue that for slots it is sufficient to
match the name and the failover property to decide that it is okay to
overwrite the slot on the old primary. However, it is not clear that
this is always safe. For example, if the old primary ran for some time
after divergence and someone re-created the slot with the same name
and failover property, it would no longer be the same slot. Unlike
WAL, we don't maintain a slot's history, so it is not equally clear
that we can overwrite the old primary's slots as is.
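
To make the concern concrete, here is a toy model in Python. It is not
PostgreSQL code: the Slot class, the slot name, and in particular the
"incarnation" field are made up for illustration, and nothing like that
field is recorded for slots today. It only shows that a check based on
name and failover cannot tell a re-created slot from the original one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    name: str
    failover: bool
    # Hypothetical marker for "which creation of the slot is this".
    # PostgreSQL does not track anything like it; it exists here only
    # so the example can show what the name/failover check misses.
    incarnation: int


def looks_like_same_slot(a: Slot, b: Slot) -> bool:
    """The check under discussion: same name and same failover property."""
    return a.name == b.name and a.failover == b.failover


# Slot as it existed on the old primary at the point of divergence.
original = Slot(name="sub1_slot", failover=True, incarnation=1)

# After divergence, the slot is dropped and re-created on the old
# primary with the same name and failover property.
recreated = Slot(name="sub1_slot", failover=True, incarnation=2)

# Slot on the new primary, which descends from the original one.
on_new_primary = Slot(name="sub1_slot", failover=True, incarnation=1)

# The name/failover check says "same slot" ...
print(looks_like_same_slot(recreated, on_new_primary))        # True
# ... even though it is a different slot from the one that was synced.
print(recreated.incarnation == on_new_primary.incarnation)    # False
```

With WAL, the timeline history plays the role of the "incarnation"
marker in this sketch; for slots there is no equivalent to compare.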

--
With Regards,
Amit Kapila.
