Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date: 2025-11-05 23:02:00
Message-ID: CAD21AoB+xpZ-+_=Zkj_RRcGk8gexkt3Sx3eLBBeWVpUqOHGvBg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Nov 2, 2025 at 10:38 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi Sawada-San.
>
> Here are my review comments for v23-0001. Most of them are cosmetic
> comment suggestions.

Thank you for the comments.

>
> ======
> src/backend/access/transam/xlog.c
>
> show_effective_wal_level:
>
> 1.
> + /*
> + * During the recovery, instead of the local wal_level value, the
> + * primary's configuration is used for effective_wal_level. We need to
> + * check the shared status instead of local XLogLogicalInfo as we don't
> + * update it synchronously during the recovery.
> + */
>
> SUGGESTION (reworded)
> During recovery, effective_wal_level reflects the primary's
> configuration rather than the local wal_level value. Check the shared
> status instead of the local XLogLogicalInfo because XLogLogicalInfo is
> not updated synchronously during recovery.
>
> ======
> src/backend/replication/logical/logicalctl.c
>
> UpdateLogicalDecodingStatus:
>
> 2.
> + /* It must be called before allowing the status change globally */
> + Assert(!LogicalDecodingCtl->status_change_allowed);
>
> Reword the comment. Don't say "It".
>
> ~~~
>
> start_logical_decoding_status_change:
>
> 3.
> +/*
> + * This function does several kinds of preparation works required to start
> + * the process of logical decoding status change. If the status change is
> + * required, it sets LogicalDecodingCtl->status_change_inprogress, and
> + * returns true. Otherwise, if it's not required or not allowed (e.g., logical
> + * slots exist or during recovery) it returns false.
> + */
>
> That first sentence "kinds of preparation works" sounds strange.
>
> SUGGESTION:
> Performs preparation work required before changing the logical decoding status.
>
> ~~~
> RequestDisableLogicalDecoding:
>
> 4.
> +/*
> + * Initiate a request for disabling logical decoding.
> + *
> + * This function expects to be called after dropping or invalidating the
> + * possibly-last logical replication slot as it doesn't check the
> logical slot presence.
> + */
> +void
> +RequestDisableLogicalDecoding(void)
>
> SUGGESTION (rearranged sentence)
> This function does not verify whether logical slots exist. It should
> be called after dropping or invalidating what might be the last
> logical replication slot.
>
> ======
> src/backend/replication/slot.c
>
> ReplicationSlotRelease:
>
> 5.
> + /*
> + * Request to disable logical decoding even though this slot might
> + * have not been the last logical slot to drop. The checkpointer will
> + * check if logical decoding should be disabled afterward anyway.
> + */
> + if (is_logical)
> + RequestDisableLogicalDecoding();
>
> There are some typos "to drop" and "afterward", but maybe it's better
> just reword like below:
>
> SUGGESTION:
> Request logical decoding to be disabled, even though this slot may not
> have been the last logical slot. The checkpointer will verify whether
> logical decoding should actually be disabled.
>
> ~~~
>
> CheckLogicalSlotExists:
>
> 6.
> +bool
> +CheckLogicalSlotExists(void)
> +{
> + bool found = false;
> +
> + if (max_replication_slots <= 0)
> + return false;
> +
> + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
> + for (int i = 0; i < max_replication_slots; i++)
> + {
> + ReplicationSlot *s;
> + bool invalidated;
> +
> + s = &ReplicationSlotCtl->replication_slots[i];
> +
> + /* cannot change while ReplicationSlotCtlLock is held */
> + if (!s->in_use)
> + continue;
> +
> + if (SlotIsPhysical(s))
> + continue;
> +
> + SpinLockAcquire(&s->mutex);
> + invalidated = s->data.invalidated != RS_INVAL_NONE;
> + SpinLockRelease(&s->mutex);
> +
> + if (invalidated)
> + continue;
> +
> + found = true;
> + break;
> + }
> + LWLockRelease(ReplicationSlotControlLock);
> +
> + return found;
> +}
>
> Some inconsistent var usage. e.g. These functions have lots of
> similarities but some say 'invalidated' and some say 'is_valid'.
>

Fixed the above points.

> ~~~
>
> InvalidatePossiblyObsoleteSlot
>
> 7.
> Oid dboid, TransactionId snapshotConflictHorizon,
> - bool *invalidated)
> + bool *released_lock_out)
> ...
> - return released_lock;
> + *released_lock_out = released_lock;
> + return invalidated;
>
> Is that output parameter 'released_lock_out' really needed? IIUC you
> could do away with this parameter entirely, and just check
> LWLockHeldByMe(ReplicationSlotControlLock) back where this function
> (InvalidatePossiblyObsoleteSlot) was called?

LWLockHeldByMe() is for debugging purposes and we should not use it in this way.

>
> ~~~
>
> 8.
> + /*
> + * Request the checkpointer process to disable logical decoding if no
> + * valid logical slots exist. While the checkpointer can call this
> + * function during a checkpoint, it initiates the request but not do the
> + * actual deactivation so that it can complete the checkpointing first.
> + */
>
> typo: "but not do the"
>
> SUGGESTED (reworded)
> Request the checkpointer to disable logical decoding if no valid
> logical slots remain. If called by the checkpointer during a
> checkpoint, only the request is initiated; actual deactivation is
> deferred until after the checkpoint completes.
>

I've updated and rebased the patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v24-0001-Toggle-logical-decoding-dynamically-based-on-log.patch application/octet-stream 109.6 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Chao Li 2025-11-05 23:21:17 Re: Optimize LISTEN/NOTIFY
Previous Message Michael Paquier 2025-11-05 22:58:55 Re: [PATCH] Fix fragile walreceiver test.