| From: | Fabrice Chapuis <fabrice636861(at)gmail(dot)com> |
|---|---|
| To: | Alexander Kukushkin <cyberdemn(at)gmail(dot)com> |
| Cc: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> |
| Subject: | Re: Issue with logical replication slot during switchover |
| Date: | 2025-11-07 17:34:33 |
| Message-ID: | CAA5-nLDqGFgZhV1T2GwjRAAWA-b0oBuWCkd1g1g0Aup45Dpm8A@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
Here is the new version of the patch
Regards,
Fabrice
On Tue, Nov 4, 2025 at 4:13 PM Fabrice Chapuis <fabrice636861(at)gmail(dot)com>
wrote:
> I propose to modify
>
> get_local_synced_slots() => to get the local logical slots (the name of
> the function must be renamed)
> and
> local_sync_slot_required() => to get also the slots that have the flag
> synced = false
>
> then in both our use cases the local slot will be dropped and recreated if
> it must be resynchronized.
> Changes in synchronize_one_slot() function are not necessary anymore.
>
> static List *
> get_local_synced_slots(void)
> {
> List *local_slots = NIL;
>
> LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
>
> for (int i = 0; i < max_replication_slots; i++)
> {
> ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
>
> /* Check if it is a synchronized slot */
> if (s->in_use && (s->data.synced || s->data.failover))
> {
> Assert(SlotIsLogical(s));
> local_slots = lappend(local_slots, s);
> }
> }
>
> LWLockRelease(ReplicationSlotControlLock);
>
> return local_slots;
> }
>
> static bool
> local_sync_slot_required(ReplicationSlot *local_slot, List *remote_slots)
> {
> bool remote_exists = false;
> bool locally_invalidated = false;
> bool synced_slot = false;
>
> foreach_ptr(RemoteSlot, remote_slot, remote_slots)
> {
> if (strcmp(remote_slot->name, NameStr(local_slot->data.name)) == 0)
> {
> remote_exists = true;
>
> /*
> * If remote slot is not invalidated but local slot is marked as
> * invalidated, then set locally_invalidated flag.
> */
> SpinLockAcquire(&local_slot->mutex);
> locally_invalidated =
> (remote_slot->invalidated == RS_INVAL_NONE) &&
> (local_slot->data.invalidated != RS_INVAL_NONE);
> synced_slot = local_slot->data.synced;
> SpinLockRelease(&local_slot->mutex);
>
> break;
> }
> }
>
> return (remote_exists && !locally_invalidated && synced_slot);
> }
>
>
> I could propose a new patch in that sense.
>
> Regards,
>
> Fabrice
>
>
> On Tue, Nov 4, 2025 at 2:01 PM Alexander Kukushkin <cyberdemn(at)gmail(dot)com>
> wrote:
>
>>
>>
>> On Tue, 4 Nov 2025 at 12:15, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>
>> wrote:
>>
>>> Hi Alexander,
>>> I understand your use case but I think it won't work like this because
>>> of how is implemented get_local_synced_slots()
>>> this function will return slots which synced flag to true only.
>>>
>>
>> Yeah. Sorry, my bad, you are absolutely right
>> about get_local_synced_slots() function.
>> I didn't validate this part of the patch, because first I wanted to get
>> feedback from hackers.
>> However, it looks like the people who build it never run Postgres in
>> production and therefore don't understand the problem and the pain.
>> I will probably just add yet another workaround to Patroni and start
>> dropping logical replication slots on standby with failover=true and
>> synced=false.
>>
>> --
>> Regards,
>> --
>> Alexander Kukushkin
>>
>
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Resync-logical-replication-slot-after-switchover.patch | application/octet-stream | 3.1 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Mario González Troncoso | 2025-11-07 17:46:08 | Re: [PATCH] Add pg_get_role_ddl() functions for role recreation |
| Previous Message | Bryan Green | 2025-11-07 17:28:48 | Re: [PATCH] O_CLOEXEC not honored on Windows - handle inheritance chain |