| From: | Alexander Kukushkin <cyberdemn(at)gmail(dot)com> |
|---|---|
| To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> |
| Subject: | Re: Issue with logical replication slot during switchover |
| Date: | 2025-11-17 10:40:09 |
| Message-ID: | CAFh8B=nJg2xinHYF2NWL_mt3E9gc6_JqaUVu+eoDYBuP9VKL3A@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi Masahiko,
On Fri, 14 Nov 2025 at 23:32, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> Given the current behavior that we cannot create a logical slot with
> failover=true on the standby, it makes sense to me that we overwrite
> the pre-existing slot (with synced=false and failover=true) on the old
> primary by the slot (with synced=true and failover=true) on the new
> primary if their names, plugin and other properties matches and the
> pre-existing slot has lesser LSNs and XIDs than the one on the new
> primary.
On the one hand, the idea of having additional checks looks reasonable, but
if I look at the existing update_local_synced_slot() function, I find the following:
    if (remote_dbid != slot->data.database ||
        remote_slot->two_phase != slot->data.two_phase ||
        remote_slot->failover != slot->data.failover ||
        strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) != 0 ||
        remote_slot->two_phase_at != slot->data.two_phase_at)
    {
        NameData plugin_name;

        /* Avoid expensive operations while holding a spinlock. */
        namestrcpy(&plugin_name, remote_slot->plugin);

        SpinLockAcquire(&slot->mutex);
        slot->data.plugin = plugin_name;
        slot->data.database = remote_dbid;
        slot->data.two_phase = remote_slot->two_phase;
        slot->data.two_phase_at = remote_slot->two_phase_at;
        slot->data.failover = remote_slot->failover;
        SpinLockRelease(&slot->mutex);
That is, if some properties of a synced slot on the standby don't match
those on the primary, we simply overwrite them.
I guess this is necessary because synchronization happens only
periodically, and between two runs a slot on the primary might have been
recreated with different properties.
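To make the overwrite behavior concrete, here is a minimal standalone sketch of the same pattern. It is not the PostgreSQL code: SlotConfig and sync_slot_config() are hypothetical names invented for illustration, the fields are simplified stand-ins for the properties in the excerpt above, and the spinlock is omitted.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the slot properties discussed above. */
typedef struct SlotConfig
{
    uint32_t database;      /* database OID */
    bool     two_phase;
    bool     failover;
    uint64_t two_phase_at;  /* LSN at which two-phase was enabled */
    char     plugin[64];    /* NUL-terminated output plugin name */
} SlotConfig;

/*
 * Overwrite the local copy with the remote properties if any of them differ.
 * Returns true when something was changed. Locking is omitted in this sketch.
 */
static bool
sync_slot_config(SlotConfig *local, const SlotConfig *remote)
{
    if (local->database == remote->database &&
        local->two_phase == remote->two_phase &&
        local->failover == remote->failover &&
        local->two_phase_at == remote->two_phase_at &&
        strcmp(local->plugin, remote->plugin) == 0)
        return false;       /* already in sync, nothing to do */

    /* No compatibility check: the remote definition simply wins. */
    *local = *remote;
    return true;
}
```

Note that in this sketch even a different plugin name is accepted without complaint, which mirrors the point that the existing code does not validate properties before overwriting them.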
Do we really need additional checks just to flip the synced flag?
Regards,
--
Alexander Kukushkin