Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date: 2025-09-17 11:19:31
Message-ID: CAA4eK1+rGPaburrJi+a7xMCcaKtG=HDb9cppKxWCWpgB6CkcDw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 16, 2025 at 11:49 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Sep 16, 2025 at 1:30 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > When user is dropping a temporary slot, we should disable the
> > decoding. The lazy behaviour should be for ERROR or session_exit
> > cases.
>
> I think it might be worth discussing whether to use lazy behavior in
> all cases.
>

Agreed.

> There are several advantages:
>
> - It mitigates the risk of connection timeouts during a logical slot
> drop or a subscription drop.
> - In scenarios involving frequent creation and deletion of logical
> slots (such as during initial data synchronization), it could
> potentially avoid the issue of a frequent switch on and off.
>
> On the other hand, drawbacks are:
>
> - users would have to wait for effective_wal_level to get decreased to
> 'replica' somehow.
> - makes the checkpointer more busy in addition to its checkpointing job.
> - it could take a longer time to disable logical decoding if the
> checkpoint is busy with a checkpointing job.
>

The last drawback could hurt the performance of such systems for
longer than necessary. It should be okay to use lazy behavior in all
cases if we can do it in a predictable time. The other background
process we could consider for the lazy processing is the launcher,
whose role is to launch apply workers for subscriptions and maintain a
conflict_slot (if required). But because disabling logical_info could
take a long time in the worst case, the launcher's own tasks could
become unpredictable. Also, if tomorrow we decide to support
dynamically changing wal_level from minimal to some higher level, the
launcher won't be the appropriate process.

The other idea could be to have a new auxiliary process to disable
logical_info lazily. It is arguable whether we should have a separate
process just for this purpose, but we have previously discussed other
tasks for such a process, like removal of old_serialized_snapshots and
old_logical_rewrite_map files. See [1]. If we agree to have a
separate process for this purpose, then disabling logical_info lazily
in all cases sounds okay to me.
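
To make the idea a bit more concrete, here is a rough standalone sketch
of what I mean by lazy disabling. It is not taken from the patch: the
struct, its fields, and all function names are hypothetical, and it
ignores locking and WAL entirely. The point is only that a slot drop
records a request and returns immediately, and a separate process acts
on it later after rechecking that no logical slot has appeared in the
meantime.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical shared state; none of these names come from the patch. */
typedef struct LogicalDecodingCtl
{
    bool logical_info_enabled; /* is logical info currently WAL-logged? */
    bool disable_requested;    /* set on slot drop, consumed lazily */
    int  n_logical_slots;      /* logical slots that currently exist */
} LogicalDecodingCtl;

/* Creating a logical slot enables logical info and cancels any pending disable. */
static void
create_logical_slot(LogicalDecodingCtl *ctl)
{
    ctl->n_logical_slots++;
    ctl->logical_info_enabled = true;
    ctl->disable_requested = false;
}

/* Dropping a slot only records a request; it never waits for the disable. */
static void
drop_logical_slot(LogicalDecodingCtl *ctl)
{
    assert(ctl->n_logical_slots > 0);
    ctl->n_logical_slots--;
    if (ctl->n_logical_slots == 0)
        ctl->disable_requested = true;
}

/*
 * One cycle of the hypothetical auxiliary process. Returns true if it
 * actually disabled logical info in this cycle.
 */
static bool
aux_process_one_cycle(LogicalDecodingCtl *ctl)
{
    if (!ctl->disable_requested)
        return false;

    ctl->disable_requested = false;

    /* Recheck: a slot may have been created after the request was made. */
    if (ctl->n_logical_slots > 0)
        return false;

    ctl->logical_info_enabled = false;
    return true;
}
```

With this shape, a quick drop-and-recreate sequence (as in initial data
synchronization) never toggles logical_info_enabled, because the
pending request is cancelled before the auxiliary process gets to it.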

[1] - https://www.postgresql.org/message-id/20230217234344.GA3357392%40nathanxps13

--
With Regards,
Amit Kapila.
