Re: Issue with logical replication slot during switchover

From: Fabrice Chapuis <fabrice636861(at)gmail(dot)com>
To: Alexander Kukushkin <cyberdemn(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
Subject: Re: Issue with logical replication slot during switchover
Date: 2025-11-07 17:34:33
Message-ID: CAA5-nLDqGFgZhV1T2GwjRAAWA-b0oBuWCkd1g1g0Aup45Dpm8A@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,
Here is the new version of the patch

Regards,

Fabrice

On Tue, Nov 4, 2025 at 4:13 PM Fabrice Chapuis <fabrice636861(at)gmail(dot)com>
wrote:

> I propose to modify
>
> get_local_synced_slots() => to get the local logical slots (the name of
> the function must be renamed)
> and
> local_sync_slot_required() => to get also the slots that have the flag
> synced = false
>
> then in both our use cases the local slot will be dropped and recreated if
> it must be resynchronized.
> Changes in synchronize_one_slot() function are not necessary anymore.
>
> static List *
> get_local_synced_slots(void)
> {
> List *local_slots = NIL;
>
> LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
>
> for (int i = 0; i < max_replication_slots; i++)
> {
> ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
>
> /* Check if it is a synchronized slot */
> if (s->in_use && (s->data.synced || s->data.failover))
> {
> Assert(SlotIsLogical(s));
> local_slots = lappend(local_slots, s);
> }
> }
>
> LWLockRelease(ReplicationSlotControlLock);
>
> return local_slots;
> }
>
> static bool
> local_sync_slot_required(ReplicationSlot *local_slot, List *remote_slots)
> {
> bool remote_exists = false;
> bool locally_invalidated = false;
> bool synced_slot = false;
>
> foreach_ptr(RemoteSlot, remote_slot, remote_slots)
> {
> if (strcmp(remote_slot->name, NameStr(local_slot->data.name)) == 0)
> {
> remote_exists = true;
>
> /*
> * If remote slot is not invalidated but local slot is marked as
> * invalidated, then set locally_invalidated flag.
> */
> SpinLockAcquire(&local_slot->mutex);
> locally_invalidated =
> (remote_slot->invalidated == RS_INVAL_NONE) &&
> (local_slot->data.invalidated != RS_INVAL_NONE);
> synced_slot = local_slot->data.synced;
> SpinLockRelease(&local_slot->mutex);
>
> break;
> }
> }
>
> return (remote_exists && !locally_invalidated && synced_slot);
> }
>
>
> I could propose a new patch in that sense.
>
> Regards,
>
> Fabrice
>
>
> On Tue, Nov 4, 2025 at 2:01 PM Alexander Kukushkin <cyberdemn(at)gmail(dot)com>
> wrote:
>
>>
>>
>> On Tue, 4 Nov 2025 at 12:15, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>
>> wrote:
>>
>>> Hi Alexander,
>>> I understand your use case but I think it won't work like this because
>>> of how is implemented get_local_synced_slots()
>>> this function will return slots which synced flag to true only.
>>>
>>
>> Yeah. Sorry, my bad, you are absolutely right
>> about get_local_synced_slots() function.
>> I didn't validate this part of the patch, because first I wanted to get
>> feedback from hackers.
>> However, it looks like the people who build it never run Postgres in
>> production and therefore don't understand the problem and the pain.
>> I will probably just add yet another workaround to Patroni and start
>> dropping logical replication slots on standby with failover=true and
>> synced=false.
>>
>> --
>> Regards,
>> --
>> Alexander Kukushkin
>>
>

Attachment Content-Type Size
v1-0001-Resync-logical-replication-slot-after-switchover.patch application/octet-stream 3.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Mario González Troncoso 2025-11-07 17:46:08 Re: [PATCH] Add pg_get_role_ddl() functions for role recreation
Previous Message Bryan Green 2025-11-07 17:28:48 Re: [PATCH] O_CLOEXEC not honored on Windows - handle inheritance chain