Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date: 2025-11-11 10:33:46
Message-ID: CAA4eK1L2p04cVX0dVqWBip_UPUGcK33U3Lb71K_heAmac43_cw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 11, 2025 at 9:35 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> Thanks for the patch. Please find a few comments:
>
>
> 1)
> ReplicationSlotsDropDBSlots:
>
> + SpinLockAcquire(&s->mutex);
> + invalidated = s->data.invalidated == RS_INVAL_NONE;
> + SpinLockRelease(&s->mutex);
> +
> + /*
> + * Count slots on other databases too so we can disable logical
> + * decoding only if no slots in the cluster.
> + */
> + if (invalidated)
> + n_valid_logicalslots++;
>
>
> This seems confusing to me. Can we instead do:
>
> SpinLockAcquire(&s->mutex);
> if (s->data.invalidated == RS_INVAL_NONE)
> n_valid_logicalslots++;
> SpinLockRelease(&s->mutex);
>

Or we can name the variable as slot_valid. Here, the name of the
variable invalidated sounds inverse of its purpose.

Few other comments:
1.
It should be
+ * called after dropping or invalidating what might be the last logical
+ * replication slot.
+ */
+void
+RequestDisableLogicalDecoding(void)


+ * Request to disable logical decoding, even though this slot may not
+ * have been the last logical slot. The checkpointer will verify if
+ * logical decoding should actually be disabled.
+ */
+ if (is_logical)
+ RequestDisableLogicalDecoding();

The above two comments don't match with each other. The first one
seems to suggest that RequestDisableLogicalDecoding() should be called
for the last logical slot whereas the caller says it could be called
for non-last slots as well.

2. The comments atop ReplicationSlotCleanup() and
ReplicationSlotsDropDBSlots() don't reflect recent changes in those
functions.

3.
+ consistency. Conversely, if <varname>wal_level</varname> set to

/set to/is set to

4.
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -51,6 +51,7 @@
#include "postmaster/bgwriter.h"
#include "postmaster/interrupt.h"
#include "replication/syncrep.h"
+#include "replication/logicalctl.h"

Here, the include ordering is incorrect (logicalctl.h should be before
syncrep.h).

5.
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/proc.h"
+#include "storage/procarray.h"
+#include "replication/logicalctl.h"
+#include "replication/slot.h"

Here, the include ordering is incorrect (replication should be before storage).

6.
+ /*
+ * This flag indicates whether logical decoding status changes are
+ * allowed. It is false during recovery and becomes true when recovery
+ * ends. Even when true, it specifically means "allowed after recovery has
+ * fully completed".
+ *
+ * This flag helps prevent race conditions with the startup process's
+ * end-of-recovery actions. After the startup process updates the logical
+ * decoding status at recovery end, other processes might attempt to
+ * toggle logical decoding before recovery fully completes (i.e.,
+ * RecoveryInProgress() returns false) - a period when WAL writes are
+ * still not permitted. Therefore, when this flag is true, we must wait
+ * for recovery to fully complete before attempting an activation or a
+ * deactivation.
+ */
+ bool status_change_allowed;

This is a good explanation of this flag but it would be better if we
could additionally explain why we can't rely on end-of-recovery to
allow toggling of logical decoding status? I think some reference or
mention of the same is required in the below place.

+/*
+ * Update the logical decoding status at end of the recovery. This function
+ * must be called before accepting writes.
+ */
+void
+UpdateLogicalDecodingStatusEndOfRecovery(void)

7.
If
+ * logical decoding was enabled before the last shutdown, it remains
+ * enabled as we might have set wal_level='logical' or have a few logical
+ * slots.
+ */
+ UpdateLogicalDecodingStatus(last_status, false);

How about making the last part of the comment a bit more precise by
changing it to the following?
/or have a few logical slots./or have at least one logical slot.

8.
+start_logical_decoding_status_change(bool new_status)
{

+ if (!new_status && CheckLogicalSlotExists())
+ {
+ LogicalDecodingCtl->pending_disable = false;
+ LWLockRelease(LogicalDecodingControlLock);
+ return false;
+ }


+ /* Return if we don't need to change the status */
+ if (LogicalDecodingCtl->logical_decoding_enabled == new_status)
+ {
+ LogicalDecodingCtl->pending_disable = false;
+ LWLockRelease(LogicalDecodingControlLock);
+ return false;
+ }

Checking the existence of logical slots could be costlier, so I think
it would be better to first check whether we really need to change the
status.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message wenhui qiu 2025-11-11 10:40:04 Re: Report oldest xmin source when autovacuum cannot remove tuples
Previous Message Dilip Kumar 2025-11-11 10:26:56 Re: Proposal: Conflict log history table for Logical Replication