Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date: 2025-12-01 09:14:04
Message-ID: CAJpy0uDtOPZpjF2Ve9AOY0ug31NUwqHUyTC8+nAne4wD30vjSw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 1, 2025 at 11:48 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Thu, Nov 27, 2025 at 11:00 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > On Thu, Nov 27, 2025 at 4:33 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > A few more comments:
> > >
> > >
> > > 10)
> > > +# Promote standby3, increasing effective_wal_level to 'logical' as its wal_level
> > > +# is set to 'logical'.
> > > +$standby3->promote;
> > > +
> > > +# Check if effective_wal_level is increased to 'logical' on the cascaded standby.
> > > +$standby3->wait_for_replay_catchup($cascade);
> > > +test_wal_level($cascade, "replica|logical",
> > > + "effective_wal_level got increased to 'logical' on standby as the new primary has wal_level='logical'"
> > > +);
> > >
> > > The message is slightly confusing due to usage of both 'standby' and
> > > 'new primary'. Can we make it:
> > > effective_wal_level got increased to 'logical' as the new primary has
> > > wal_level='logical'
> > >
> >
> > Upon reconsideration, we can keep it as is. I understand the intent now.
>
> Okay.
>
> >
> > ===
> >
> > I tried to test the possible race scenarios (those known to me). They
> > seem to work well, except for one minor thing:
> >
> > Let's say there is slot1 present, backend1 is trying to drop slot1 and
> > backend2 is trying to create slot2.
> >
> > DisableLogicalDecodingIfNecessary() kicks in first and reaches the
> > stage where it has disabled logical decoding and released the lock.
> > Before it can emit the signal barrier and log, EnsureLogicalDecodingEnabled()
> > kicks in and completes its execution.
> > In such a case we end up with LOG messages in reverse order in the log file:
> >
> > 09:47:16.489 IST LOG: logical decoding is enabled upon creating a new
> > logical replication slot
> > 09:47:17.484 IST LOG: logical decoding is disabled because there are
> > no valid logical replication slots
> >
> > while logical decoding is actually enabled in the system.
> >
> > Shall we check 'if (!LogicalDecodingCtl->xlog_logical_info)' before
> > logging in DisableLogicalDecodingIfNecessary()?
> >
>
> In DisableLogicalDecodingIfNecessary(), we have (without comments):
>
> if (!LogicalDecodingCtl->xlog_logical_info || CheckLogicalSlotExists())
> {
>     LogicalDecodingCtl->pending_disable = false;
>     LWLockRelease(LogicalDecodingControlLock);
>     return;
> }
>
> START_CRIT_SECTION();
>
> LogicalDecodingCtl->logical_decoding_enabled = false;
> write_logical_decoding_status_update_record(false);
> LogicalDecodingCtl->xlog_logical_info = false;
> LogicalDecodingCtl->pending_disable = false;
>
> LWLockRelease(LogicalDecodingControlLock);
>
> END_CRIT_SECTION();
>
> EmitProcSignalBarrier(PROCSIGNAL_BARRIER_UPDATE_XLOG_LOGICAL_INFO);
>
> ereport(LOG,
>         errmsg("logical decoding is disabled because there are no valid logical replication slots"));
>
> Does it make sense to reorder them to the following?
>

Yes, it will avoid the issue.

> START_CRIT_SECTION();
>
> LogicalDecodingCtl->logical_decoding_enabled = false;
> write_logical_decoding_status_update_record(false);
> LogicalDecodingCtl->xlog_logical_info = false;
> LogicalDecodingCtl->pending_disable = false;
>
> END_CRIT_SECTION();
>
> ereport(LOG,
>         errmsg("logical decoding is disabled because there are no valid logical replication slots"));
>
> LWLockRelease(LogicalDecodingControlLock);
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com
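
To illustrate why logging before releasing the lock avoids the reversed messages, here is a minimal single-threaded sketch of the interleaving discussed above. It is not the patch code: Sim, emit(), enable_decoding() and the two disable_* functions are hypothetical names, the "intruder" callback stands in for a concurrent backend running EnsureLogicalDecodingEnabled(), and the lock is only modeled by the position at which the intruder is allowed to run. It also assumes the enable path changes state and logs as one step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_LOG 8

/* Hypothetical stand-in for the shared state plus the server log. */
typedef struct
{
    bool        enabled;        /* models logical_decoding_enabled */
    const char *log[MAX_LOG];   /* messages in the order they were emitted */
    size_t      nlog;
} Sim;

/* Append one message to the simulated log. */
static void
emit(Sim *s, const char *msg)
{
    assert(s->nlog < MAX_LOG);
    s->log[s->nlog++] = msg;
}

/* Models the enable path: state change and LOG happen as one step. */
static void
enable_decoding(Sim *s)
{
    s->enabled = true;
    emit(s, "enabled");
}

/*
 * Original ordering: the lock is released before the LOG message, so a
 * concurrent backend (the intruder) can run in the gap between them.
 */
static void
disable_log_after_release(Sim *s, void (*intruder) (Sim *))
{
    s->enabled = false;         /* done while holding the lock */
    /* lock released here */
    if (intruder)
        intruder(s);
    emit(s, "disabled");
}

/*
 * Reordered version: the LOG message is emitted while the lock is still
 * held, so the intruder can only run after it.
 */
static void
disable_log_before_release(Sim *s, void (*intruder) (Sim *))
{
    s->enabled = false;         /* done while holding the lock */
    emit(s, "disabled");
    /* lock released here */
    if (intruder)
        intruder(s);
}

/* True if the most recent log message agrees with the current state. */
static bool
last_log_matches_state(const Sim *s)
{
    if (s->nlog == 0)
        return true;
    return (strcmp(s->log[s->nlog - 1], "enabled") == 0) == s->enabled;
}
```

Starting from an enabled state and passing enable_decoding as the intruder, disable_log_after_release() leaves "disabled" as the last message while the state is enabled, which is the symptom reported above. disable_log_before_release() ends with "enabled" as the last message, matching the state.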
