Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date: 2025-09-30 05:57:59
Message-ID: CAJpy0uAgytiQ=GHnx9_dkRxfouV7-3S+8GWAdMKSL5CZDVjMog@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 30, 2025 at 7:28 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Thu, Sep 25, 2025 at 10:43 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > On Fri, Sep 26, 2025 at 12:46 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Sep 25, 2025 at 4:57 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > > >
> > > > On Tue, Sep 23, 2025 at 3:28 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > >
> > > > >
> > > > > I've attached the updated patch. It incorporates all comments I got so
> > > > > far and implements to lazily disable logical decoding. It's used only
> > > > > when the process tries to disable logical decoding during process
> > > > > exit.
> > > > >
> > > >
> > > > I am resuming the review now. I agree with the discussion of lazily
> > > > disabling logical decoding on ERROR or process-exit for temp-slot.
> > > >
> > > > Few initial comments:
> > >
> > > Thank you for the comments!
> > >
> > > >
> > > > 1)
> > > > I see that on standby too, during proc-exit, we set 'pending_disable'.
> > > > But it never resets it, as DisableLogicalDecodingIfNecessary is no-op
> > > > on standby. And thus the checkpoint keeps on attempting to reset it
> > > > everytime. Do we even need to set it on standby?
> > > >
> > > > Logfile has repeated: 'start completing pending logical decoding
> > > > disable request'
> > >
> > > Ugh, I missed that part. I think that standbys should not delegate the
> > > deactivation to the checkpointer uless the deactivation is actually
> > > required.
> > >
> > > > 2)
> > > > + ereport(LOG,
> > > > + (errmsg("skip disabling logical decoding as during process exit")));
> > > >
> > > > 'as' not needed.
> > >
> > > I've fixed the above two points and attached the new version patch.
> > >
> >
> > Thanks.
> >
> > 1)
> > Currently, in the existing implementation, if a promotion is in
> > progress (delay_status_change = true) and, during that time, a process
> > exits (causing a temporary slot to be released), then on the standby,
> > we may end up setting pending_disable. As a result, the checkpointer
> > will have to wait for the transition to complete before it can proceed
> > with disabling logical decoding (if needed).
> >
> > a)
> > This means the checkpoint may be delayed further, depending on how
> > long it takes for all processes to respond to ProcSignalBarrier().
> >
> > b)
> > Additionally, consider the case where the promotion fails midway
> > (after UpdateLogicalDecodingStatusEndOfRecovery). If the checkpointer
> > still sees RecoveryInProgress and delay_status_change as true, could
> > it end up waiting indefinitely for the transition to complete? In my
> > testing, when promotion fails and the startup process exits, it
> > usually causes the rest of the processes, including the checkpointer,
> > to terminate as well. So, it seems that a dangling pending_disable
> > state may not actually occur on standby in practice.
> >
> > I believe scenario (b) can't really happen, but I still wanted to
> > check with you.
>
> I think so. We don't allow the system to continue starting up if the
> startup process exits with an error.

Okay.

>
> > I am not sure if (a) is a real concern — what’s your take on it?
>
> Since the startup sets delay_status_change=true after creating a
> checkpoint by PerformRecoveryXLogAction(), I thought that it would not
> be a problem in practice even if the checkpointer ends up waiting for
> the recovery to complete. On the other hand, once the process
> delegated the deactivation to the checkpointer, it would also be okay
> not to disable logical decoding at its first attempt. One required
> change would be that if the checkpointer also skips the deactivation
> when delay_status_change=true, the startup would need to wake up the
> checkpointer after completing the recovery. Otherwise, the
> checkpointer might not disable logical decoding until the next
> checkpoint time. I wanted to avoid adding this part but I'm open to
> other opinions.
>

I see your point. But I’ll skip further discussion on this for now and
want to focus on your second point first, because if that can be done,
this won’t be necessary.

> > 2)
> > As per discussion in [1], there was a proposal to implement lazily
> > disabling decoding both in ERROR and proc-exit scenarios. But I see it
> > only implemented in proc-exit scenario. Are we planning to do it for
> > ERROR as well?
>
> After more thoughts, I realized that I missed the fact that we
> actually wrote an ABORT record during the process shutdown.
> ShutdownPostgres() that calls AbortOutOfAnyTransaction() is the last
> callback in before_shmem_exit callbacks. So it's probably okay to
> write STATUS_CHANGE record to disable logical decoding even during
> process shutdown.

Yes, that’s correct, we write as many Abort records as there are open
transactions. And thus IMO, writing the Logical-Decoding status-change
record, which occurs at most once, should be fine. During the disable
process, we emit a PROCSIGNAL_BARRIER but don’t wait for responses
from others, so this should also be acceptable. But let’s see what
others have to say on this.

> As for the race condition at the end of recovery between the startup
> process and processes updating the logical decoding status, we use
> delay_status_change flag so that any logical decoding status change
> initiated in the particular window (i.e., between the startup sets
> delay_status_change and the recovery completes) has to wait for the
> startup to complete all end-of-recovery actions. An alternative idea
> would be that we allow processes to write STATUS_CHANGE records in the
> particular window even during recovery, by using
> LocalSetXLogInsertAllowed().
>

For everyone’s reference, I’m attaching the link to the race condition
we discussed earlier: [1].

To me, allowing status changes during that short window seems better
and simpler than the previous approach of delaying them. But I do have
one concern: Could the standby end up with an incorrect logical
decoding status if, during the promotion (when allow_status_change is
true), a slot is dropped causing the status to be disabled on the
standby, but the promotion doesn’t complete? In that case, upon
restart, since the standby remains in standby mode, it might pick up
the changed status via checkPoint.logicalDecodingEnabled, resulting in
logical decoding being disabled instead of enabled as it is on the
primary.

Is this a possibility? I haven’t had the chance to simulate and verify
this scenario yet.

> If we implement these ideas, we can simplify the patch quite well as
> we no longer need the lazy behavior nor wait for the recovery to
> complete.

Agree, if we can make these ideas work, the patch will be much simpler.

[1]: https://www.postgresql.org/message-id/OSCPR01MB14966C5E31CA0ACAD07AF8B5FF524A%40OSCPR01MB14966.jpnprd01.prod.outlook.com

thanks
Shveta

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2025-09-30 06:05:42 Re: GB18030-2022 Support in PostgreSQL
Previous Message BharatDB 2025-09-30 05:30:48 Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL