From: | Fabrice Chapuis <fabrice636861(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: failover logical replication slots |
Date: | 2025-07-11 15:12:02 |
Message-ID: | CAA5-nLC2__W71QmQtZ37Cm0-6jf5ZJUkjbb2QqrR1HYTNB3M=g@mail.gmail.com |
Lists: | pgsql-hackers |
Hi Amit,
Here is a proposed solution to handle the problem of creating the logical
replication slot on standby after a switchover.
Thank you for your comments and help on this issue.
Regards
Fabrice
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 656e66e..296840a 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -627,6 +627,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
     ReplicationSlot *slot;
     XLogRecPtr  latestFlushPtr;
     bool        slot_updated = false;
+    bool        overwriting_failover_slot = true; /* could be a GUC */
     /*
      * Make sure that concerned WAL is received and flushed before syncing
@@ -654,19 +655,37 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
     if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
     {
         bool        synced;
+        bool        failover_status = remote_slot->failover;
         SpinLockAcquire(&slot->mutex);
         synced = slot->data.synced;
         SpinLockRelease(&slot->mutex);
-        /* User-created slot with the same name exists, raise ERROR. */
-        if (!synced)
-            ereport(ERROR,
+        if (!synced){
+            /*
+             * Check if we need to overwrite an existing failover slot and
+             * if slot has the failover flag set to true
+             * and the sync_replication_slots is on,
+             * other check could be added here */
+            if (overwriting_failover_slot && failover_status && sync_replication_slots){
+
+                /* Get rid of a replication slot that is no longer wanted */
+                ReplicationSlotDrop(remote_slot->name, true);
+                ereport(WARNING,
+                    errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                    errmsg("slot \"%s\" already exists"
+                    " on the standby but it will be dropped because overwriting_failover_slot is set to true",
+                    remote_slot->name));
+                return false; /* Going back to the main loop after dropping the failover slot */
+            }
+            /* User-created slot with the same name exists, raise ERROR. */
+            else
+                ereport(ERROR,
                     errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                     errmsg("exiting from slot synchronization because same"
                            " name slot \"%s\" already exists on the standby",
                            remote_slot->name));
-
+        }
         /*
          * The slot has been synchronized before.
          *
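
To make the three possible outcomes of the patched branch easier to review, here is a minimal standalone sketch of the decision it takes when a slot with the same name already exists on the standby. It is not PostgreSQL code: the enum, its values, and the functions decide_existing_slot_action() and example_usage() are hypothetical names introduced only for illustration. The four boolean inputs stand in for slot->data.synced, remote_slot->failover, the sync_replication_slots setting, and the overwriting_failover_slot flag from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome names; they do not exist in PostgreSQL. */
typedef enum ExistingSlotAction
{
    SLOT_ACTION_UPDATE_SYNCED,  /* slot was created by slot sync: keep syncing it */
    SLOT_ACTION_DROP_AND_RETRY, /* leftover failover slot: drop it, retry next cycle */
    SLOT_ACTION_RAISE_ERROR     /* user-created slot with the same name: stop */
} ExistingSlotAction;

/*
 * Mirrors the order of checks in the patched branch of synchronize_one_slot():
 * first the local "synced" flag, then the three conditions that allow the
 * existing slot to be dropped, and otherwise the pre-existing ERROR path.
 */
static ExistingSlotAction
decide_existing_slot_action(bool local_synced,
                            bool remote_failover,
                            bool sync_replication_slots,
                            bool overwriting_failover_slot)
{
    if (local_synced)
        return SLOT_ACTION_UPDATE_SYNCED;

    if (overwriting_failover_slot && remote_failover && sync_replication_slots)
        return SLOT_ACTION_DROP_AND_RETRY;

    return SLOT_ACTION_RAISE_ERROR;
}

/* Small usage example covering each outcome. */
static void
example_usage(void)
{
    /* Already-synced slot: nothing is dropped. */
    assert(decide_existing_slot_action(true, true, true, true)
           == SLOT_ACTION_UPDATE_SYNCED);

    /* Old primary after a switchover: its own failover slot is replaced. */
    assert(decide_existing_slot_action(false, true, true, true)
           == SLOT_ACTION_DROP_AND_RETRY);

    /* Flag disabled: behaviour is unchanged, the ERROR is raised. */
    assert(decide_existing_slot_action(false, true, true, false)
           == SLOT_ACTION_RAISE_ERROR);
}
```

Written this way, it is visible that the drop path keys off the remote slot's failover flag rather than any property of the local slot, so a user-created local slot whose name matches a remote failover slot would also be dropped. That may be one of the "other checks" the comment in the patch leaves open.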
On Thu, Jun 12, 2025 at 4:27 PM Fabrice Chapuis <fabrice636861(at)gmail(dot)com>
wrote:
> Yes, of course, maybe for PG 19.
>
> Regards,
> Fabrice
>
> On Thu, Jun 12, 2025 at 12:31 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
>
>> On Thu, Jun 12, 2025 at 3:53 PM Fabrice Chapuis <fabrice636861(at)gmail(dot)com>
>> wrote:
>> >
>> > However, the problem still persists: it is currently not possible to
>> > perform an automatic switchover after creating a new subscription.
>> >
>> > Would it be reasonable to consider adding a GUC to address this issue?
>> > I can propose a patch in that sense if it seems appropriate.
>> >
>>
>> Yeah, we can consider that, though I don't know at this stage if GUC
>> is the only way, but I hope you understand that it will be for PG19.
>>
>> --
>> With Regards,
>> Amit Kapila.
>>
>