| From: | Fabrice Chapuis <fabrice636861(at)gmail(dot)com> |
|---|---|
| To: | Alexander Kukushkin <cyberdemn(at)gmail(dot)com> |
| Cc: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> |
| Subject: | Re: Issue with logical replication slot during switchover |
| Date: | 2025-11-04 11:14:54 |
| Message-ID: | CAA5-nLC=3QMehrktEB1MOrAk_ynG_u=pD_iUOup4MmkN2Xv0sA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Alexander,
I understand your use case but I think it won't work like this because of
how is implemented get_local_synced_slots()
this function will return slots which synced flag to true only.
in point 4 Start node1: synced flag is false and failover flag is true
then it won't enter the loop and the slot will not be dropped
static void
drop_local_obsolete_slots(List *remote_slot_list)
{
List *local_slots = get_local_synced_slots();
foreach_ptr(ReplicationSlot, local_slot, local_slots)
{
/* Drop the local slot if it is not required to be retained. */
if (!local_sync_slot_required(local_slot, remote_slot_list)) => will check
of remote slot exist and if local slot is not invalidated
{
bool synced_slot;
...
We could change the function get_local_synced_slots to make it work
get_local_synced_slots(void)
{
List *local_slots = NIL;
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
{
ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
/* Check if it is a synchronized slot */
if (s->in_use && (s->data.failover || s->data.synced))
{
Assert(SlotIsLogical(s));
local_slots = lappend(local_slots, s);
Thanks for your feedback
Regards,
Fabrice
....
On Mon, Nov 3, 2025 at 8:51 AM Alexander Kukushkin <cyberdemn(at)gmail(dot)com>
wrote:
> Hi Fabrice,
>
> On Fri, 31 Oct 2025 at 17:45, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>
> wrote:
>
>> Thanks for your large analyze and explanation Alexander. If we go in the
>> direction you propose and leave the option to use a suplementary flag
>> allow_overwrite, may I propose you some modifications in the patch v0 you
>> have attached:
>>
>> Why modifying this function?
>>
>> drop_local_obsolete_slots(List *remote_slot_list)
>>
>
> Think about the following case:
> 1. We have node1 (primary) and node2 (standby), with slot
> foo(failover=true)
> 2. We stop node1 for maintenance and promote node2
> 3. DROP replication slot on node2
> 4. Start node1
>
> In this scenario the logical slot "foo" will be left unattended and it
> needs to be dropped, because it doesn't exist on the primary.
>
> As I said earlier, IMO it is only the "failover" flag that effectively
> indicates the purpose of the slot. The sync flag is purely informative.
>
> Regards,
> --
> Alexander Kukushkin
>
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Michael Paquier | 2025-11-04 11:17:02 | Re: [PATCH] Add error message for out-of-memory in passwordFromFile() |
| Previous Message | Ashutosh Bapat | 2025-11-04 10:58:55 | Re: Report bytes and transactions actually sent downtream |