Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date: 2025-10-23 21:53:15
Message-ID: CAD21AoAVomNF8rmVEL9nv5G0beO57hfi78Nwz3eFKfUajo+SMQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 23, 2025 at 11:52 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Thu, Oct 23, 2025 at 3:39 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > Please find a few comments:
> >
> > 1)
> > RequestDisableLogicalDecoding:
> > Shall we have 'Assert(!MyReplicationSlot)' here to ensure that this
> > function is invoked after we have released and dropped the slot. This
> > is similar to 'Assert(MyReplicationSlot)' in
> > EnsureLogicalDecodingEnabled().
>
> I think the reason why we need 'Assert(MyReplicationSlot)' in
> EnsureLogicalDecodingEnabled() is that otherwise the caller cannot get
> the expected result. IOW, suppose w can call
> EnsureLogicalDecodingEnabled() without holding a logical slot, logical
> decoding is not enabled if all logical slots are dropped concurrently
> in spite of the function name having the term "Ensure". As for
> RequestDisableLogicalDecoding(), it can request to disable logical
> decoding even while holding a logical slot. It might practically make
> less sense but it's no problem in principle. I guess it makes sense to
> put the assertion into DisableLogicalDecodingIfNecessary() instead.
>
> >
> > 2)
> > EnsureLogicalDecodingEnabled sets status_change_inprogress to false
> > before writing the WAL record. Initially, I thought this could lead to
> > a situation where another session might drop the same slot, since
> > there’s nothing preventing it (as status_change_inprogress is false
> > and LogicalDecodingControlLock has been released). This could, in
> > theory, result in the checkpointer writing the WAL record that
> > disables logical decoding before EnsureLogicalDecodingEnabled() writes
> > its WAL record that enables it — potentially causing an issue. But
> > this problem could not be reproduced in practice, since the slot was
> > acquired by session1, and therefore another session attempting to drop
> > it couldn’t acquire it. That said, I still lean towards setting
> > status_change_inprogress = false after the WAL record has been written
> > in EnsureLogicalDecodingEnabled(). Thoughts?
> > If not, we could add a comment explaining why this scenario is not a problem.
>
> Good point. I'll incorporate your suggestion.
>
> >
> > 3)
> > + # Drop the logical slot, requesting to disable logical decoding to
> > the checkpointer.
> > + # It has to wait for the recovery to complete before disabling
> > logical decoding.
> > + $standby5->safe_psql('postgres',
> > + qq[select pg_drop_replication_slot('standby5_slot');]);
> > +
> > + # Resume the startup process to complete the recovery.
> > + $standby5->safe_psql('postgres',
> > + qq[select injection_points_wakeup('startup-logical-decoding-status-change-end-of-recovery')]
> > + );
> > +
> > + $standby5->wait_for_log(
> > + "waiting for recovery completion to change logical decoding status");
> >
> > Shouldn’t we check the log for "waiting for recovery completion..."
> > before triggering injection_points_wakeup?
> >
> > IIUC, the current order may cause intermittent failures. Imagine that
> > drop-slot has not yet reached the LogicalDecodingStatusChangeAllowed
> > and RecoveryInProgress checks, and we release the injection point in
> > the meantime. In that case, drop-slot may never end up waiting, and we
> > might not see the expected log message.
>
> Agreed.
>
> I'll self-review the patch again and share the updated version patch.
>

I've addressed the above comments and made some cosmetic changes. I
think this patch is in good shape, so I am planning to push it next
week or so unless there are major review comments.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v21-0001-Enable-logical-decoding-dynamically-based-on-log.patch application/octet-stream 101.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John H 2025-10-23 22:01:32 Re: Making pg_rewind faster
Previous Message Masahiko Sawada 2025-10-23 21:47:53 Re: Support getrandom() for pg_strong_random() source