Re: Clear logical slot's 'synced' flag on promotion of standby

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clear logical slot's 'synced' flag on promotion of standby
Date: 2025-09-09 23:52:33
Message-ID: CAD21AoAx1OHmwgcVTT08zE2vjYF-vumRkr-caJc+gcKFRbua8Q@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 8, 2025 at 11:21 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> Hi,
>
> This is a spin-off thread from [1].
>
> Currently, in the slot-sync worker, we have an error scenario [2]
> where, during slot synchronization, if we detect a slot with the same
> name and its synced flag is set to false, we emit an error. The
> rationale is to avoid potentially overwriting a user-created slot.
>
> But while analyzing [1], we observed that this error can lead to
> inconsistent behavior during switchovers. On the first switchover, the
> new standby logs an error: "Exiting from slot synchronization because
> a slot with the same name already exists on the standby." But during
> a double switchover, this error does not occur.
>
> Upon re-evaluating this, it seems more appropriate to clear the synced
> flag after promotion, as the flag does not hold any meaning on the
> primary. Doing so would ensure consistent behavior across all
> switchovers, as the same error will be raised, avoiding the risk of
> overwriting users' slots.

There is the following comment in FinishWalRecovery():

/*
 * Shutdown the slot sync worker to drop any temporary slots acquired by
 * it and to prevent it from keep trying to fetch the failover slots.
 *
 * We do not update the 'synced' column in 'pg_replication_slots' system
 * view from true to false here, as any failed update could leave 'synced'
 * column false for some slots. This could cause issues during slot sync
 * after restarting the server as a standby. While updating the 'synced'
 * column after switching to the new timeline is an option, it does not
 * simplify the handling for the 'synced' column. Therefore, we retain the
 * 'synced' column as true after promotion as it may provide useful
 * information about the slot origin.
 */
ShutDownSlotSync();
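
To make the concern in that comment concrete, here is a toy sketch of the
failure mode as I read it. This is not PostgreSQL code: ToySlot,
toy_clear_synced() and toy_can_sync() are made-up names, and the "failure"
is simulated by simply stopping partway through the loop. It only models
the rule quoted above, that a same-named local slot with synced = false
makes slot sync give up.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for a replication slot; not the real ReplicationSlot. */
typedef struct ToySlot
{
    const char *name;
    bool        synced;
} ToySlot;

/*
 * Hypothetical promotion step: clear 'synced' slot by slot, but stop
 * after 'fail_after' slots to imitate an update that fails midway.
 * Returns the number of slots whose flag was actually cleared.
 */
static int
toy_clear_synced(ToySlot *slots, int nslots, int fail_after)
{
    int         i;

    assert(fail_after >= 0);
    for (i = 0; i < nslots && i < fail_after; i++)
        slots[i].synced = false;
    return i;
}

/*
 * Toy version of the standby-side check: a local slot with the same
 * name and synced = false is treated as user-created, so syncing over
 * it is refused. Returns true if the remote slot may be synced.
 */
static bool
toy_can_sync(const ToySlot *slots, int nslots, const char *remote_name)
{
    for (int i = 0; i < nslots; i++)
    {
        if (strcmp(slots[i].name, remote_name) == 0)
            return slots[i].synced;
    }
    return true;                /* no local slot with that name */
}
```

With three slots that all start as synced = true, calling
toy_clear_synced(slots, 3, 2) leaves the first two with synced = false and
the third still true. If that server later restarts as a standby,
toy_can_sync() refuses the first two names and accepts the third, which is
the mixed state the comment is worried about.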

Does the patch address the above concerns?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
