Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date: 2025-06-04 01:10:44
Message-ID: CAD21AoBRD5zmh8D3HW3u=O7EwWJYc8ozik5Tje4Yi2RYe3iCzg@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 20, 2025 at 9:54 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, May 21, 2025 at 12:45 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Mon, May 19, 2025 at 2:05 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Sun, May 18, 2025 at 1:09 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > > > On Sat, May 10, 2025 at 7:08 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > >
> > > > > Can we have a parameter like immediately_reserve in
> > > > > create_logical_slot API, similar to what we have for physical slots?
> > > > > We need to work out the details, but that should address the kind of
> > > > > use case you are worried about, unless I am missing something.
> > > >
> > > > Interesting idea. One concern in my mind is that in the use case I
> > > > mentioned above, users would need to carefully manage the extra
> > > > logical slot to keep the logical decoding active. The logical decoding
> > > > is deactivated on the standby as soon as users drop all logical slots
> > > > on the primary.
> > > >
> > >
> > > Yes, but the same is true for a physical slot in the case of physical
> > > replication used via primary_slot_name parameter.
> >
> > Could you elaborate on this?
> >
>
> I am trying to correlate this with the case where the standby no
> longer needs its physical slot for some reason, such as a standby
> machine failure, or someone using pg_createsubscriber on the standby
> to make it a subscriber. In such a case, the user needs to manually
> remove the physical slot on the primary. The two cases differ, but
> the point is that one may need to manage a physical slot as well.

Thank you for clarifying this. I see your point.

> >
> > I recently had a discussion with Ashutosh at PGConf.dev regarding an
> > alternative approach: introducing a new command syntax such as "ALTER
> > SYSTEM UPDATE wal_level TO 'logical'". In his presentation[1], he
> > outlined this proposed command as a means to modify specific GUC
> > parameters synchronously. The backend executing this command would
> > manage the transition, allowing users to interrupt the process via
> > Ctrl-C if necessary. In the specific context of wal_level change, this
> > command could be designed to reject operations like "ALTER SYSTEM
> > UPDATE wal_level TO 'minimal'" with an error, effectively preventing
> > undesirable wal_level transitions to or from 'minimal'. While this
> > approach shares similarities with our previous proposal of
> > implementing a dedicated SQL function for WAL level modifications, it
> > offers a more standardized interface for users.
> >
> > Though I find merit in this proposal, I remain uncertain about its
> > implementation details and whether it represents the optimal solution
> > for online wal_level changes, particularly given that our current
> > approach of automatic WAL level adjustment appears viable.
> >
>
> Yeah, I find the idea that the presence of a logical slot will allow
> the user to enable logical decoding/replication more appealing than
> this new alternative, leaving aside the challenges of realizing it.

I've drafted this idea. Here is a summary of the two attached patches:

0001 patch allows us to create a logical slot without WAL reservation.

0002 patch is the main patch for dynamically enabling/disabling
logical decoding when wal_level is 'replica'. It's in PoC state and
has a lot of XXX comments. One thing I think we need to consider is
that disabling logical decoding needs to write a WAL record for
standbys, and that this happens when the last logical slot is dropped.
It's therefore possible that we write a WAL record during process
exit (e.g., ReplicationSlotRelease() and ReplicationSlotCleanup() are
called by ReplicationSlotShmemExit()). It might be safe as long as we
do that in a before_shmem_exit callback, but I'm not sure whether
there is a chance of doing that in an on_shmem_exit callback. It would
be better to somehow disable logical decoding lazily.
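
To make the "lazy" idea concrete, here is a rough toy model in Python.
It is not PostgreSQL code and is not taken from the patches: the class,
the method names, and the WAL record labels are all hypothetical, and
the choice of which long-lived process would run the deferred step is
left open. The sketch only shows the shape of the approach: dropping
the last slot merely records a request, and the WAL write happens later
after rechecking that no logical slot exists.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Toy stand-in for shared state; NOT PostgreSQL's real structures."""

    logical_slots: set = field(default_factory=set)
    logical_decoding_enabled: bool = False
    disable_requested: bool = False
    wal: list = field(default_factory=list)  # hypothetical record labels

    def create_logical_slot(self, name: str) -> None:
        # Enabling is done eagerly: the first logical slot turns decoding on.
        if not self.logical_decoding_enabled:
            self.wal.append("LOGICAL_DECODING_ON")
            self.logical_decoding_enabled = True
        self.logical_slots.add(name)

    def drop_logical_slot(self, name: str) -> None:
        # This may run during process exit, so it writes no WAL record.
        # It only remembers that a disable should be considered later.
        self.logical_slots.discard(name)
        if not self.logical_slots:
            self.disable_requested = True

    def process_pending_disable(self) -> bool:
        # Meant to run later in some long-lived process (an assumption).
        # Returns True only if decoding was actually disabled.
        if not self.disable_requested:
            return False
        self.disable_requested = False
        # Recheck: a slot may have been created since the request was made.
        if self.logical_slots or not self.logical_decoding_enabled:
            return False
        self.wal.append("LOGICAL_DECODING_OFF")
        self.logical_decoding_enabled = False
        return True
```

In this model the exit path never touches the WAL; the cost is that
logical decoding may stay enabled for a while after the last slot is
gone, and the deferred step has to tolerate a slot being recreated in
the meantime.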

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v1-0001-Allow-to-create-logical-slots-with-no-WAL-reserva.patch application/octet-stream 13.5 KB
v1-0002-Enable-logical-decoding-dynamically-based-on-logi.patch application/octet-stream 46.8 KB
