From: | Ajin Cherian <itsajin(at)gmail(dot)com> |
---|---|
To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
Cc: | Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Clear logical slot's 'synced' flag on promotion of standby |
Date: | 2025-09-18 10:45:46 |
Message-ID: | CAFPTHDbCfZfN6BSX2sN-rcZdv=Px2sKV+YYZ2+cnwLfXMm=oOQ@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Sep 12, 2025 at 1:56 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> The approach seems valid and should work, but introducing a new file
> like promote.inprogress for this purpose might be excessive. We can
> first try analyzing existing information to determine whether we can
> distinguish between the two scenarios -- a primary in crash recovery
> immediately after a promotion attempt versus a regular primary. If we
> are unable to find any way, we can revisit the idea.
>
I needed a way to reset slots not only during promotion, but also
after a crash that occurs while slots are being reset, so there would
be a fallback mechanism to clear them again on startup. As Shveta
pointed out, it wasn’t trivial to tell apart a standby restarting
after crashing during promotion from a primary restarting after a
crash. So I decided to just reset slots every time primary (or a
standby after promotion) restarts.
Because this fallback logic will run on every primary restart, it was
important to minimize overhead added by the patch. After some
discussion, I placed the reset logic in RestoreSlotFromDisk(), which
is invoked by StartupReplicationSlots() whenever the server starts.
Because RestoreSlotFromDisk() already loops through all slots, this
adds minimum extra work; but also ensures the synced flag is cleared
when running on a primary.
The next challenge was finding a reliable flag to distinguish
primaries from standbys, since we really don’t want to reset the flag
on a standby. I tested StandbyMode, RecoveryInProgress(), and
InRecovery. But during restarts, both RecoveryInProgress() and
InRecovery are always true on both primary and standby. In all my
testing, StandbyMode was the only variable that consistently
differentiated between the two, which is what I used.
I have also changed the documentation and comments regarding 'synced'
flags not being reset on the primary.
regards,
Ajin Cherian
Fujitsu Australia
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Reset-synced-slots-when-a-standby-is-promoted.patch | application/octet-stream | 7.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2025-09-18 11:02:56 | Re: [Patch] add new parameter to pg_replication_origin_session_setup |
Previous Message | vignesh C | 2025-09-18 10:31:09 | Re: Logical Replication of sequences |