Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date: 2025-09-30 21:40:26
Message-ID: CAD21AoA3fEU=ZM_bsyFn4fVsz_d=uO-TQBM9tDH2DXt+o4i7KQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 30, 2025 at 1:58 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Tue, Sep 30, 2025 at 7:28 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Thu, Sep 25, 2025 at 10:43 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > > 2)
> > > As per discussion in [1], there was a proposal to implement lazily
> > > disabling decoding both in ERROR and proc-exit scenarios. But I see it
> > > only implemented in proc-exit scenario. Are we planning to do it for
> > > ERROR as well?
> >
> > After more thoughts, I realized that I missed the fact that we
> > actually wrote an ABORT record during the process shutdown.
> > ShutdownPostgres() that calls AbortOutOfAnyTransaction() is the last
> > callback in before_shmem_exit callbacks. So it's probably okay to
> > write STATUS_CHANGE record to disable logical decoding even during
> > process shutdown.
>
> Yes, that’s correct, we write as many Abort records as there are open
> transactions. And thus IMO, writing the Logical-Decoding status-change
> record, which occurs at most once, should be fine. During the disable
> process, we emit a PROCSIGNAL_BARRIER but don’t wait for responses
> from others, so this should also be acceptable. But let’s see what
> others have to say on this.

Thank you for the comment. Agreed.

>
> > As for the race condition at the end of recovery between the startup
> > process and processes updating the logical decoding status, we use
> > delay_status_change flag so that any logical decoding status change
> > initiated in the particular window (i.e., between the startup sets
> > delay_status_change and the recovery completes) has to wait for the
> > startup to complete all end-of-recovery actions. An alternative idea
> > would be that we allow processes to write STATUS_CHANGE records in the
> > particular window even during recovery, by using
> > LocalSetXLogInsertAllowed().
> >
>
> For everyone’s reference, I’m attaching the link to the race condition
> we discussed earlier: [1].
>
> To me, allowing status changes during that short window seems better
> and simpler than the previous approach of delaying them. But I do have
> one concern: Could the standby end up with an incorrect logical
> decoding status if, during the promotion (when allow_status_change is
> true), a slot is dropped causing the status to be disabled on the
> standby, but the promotion doesn’t complete? In that case, upon
> restart, since the standby remains in standby mode, it might pick up
> the changed status via checkPoint.logicalDecodingEnabled, resulting in
> logical decoding being disabled instead of enabled as it is on the
> primary.
>
> Is this a possibility? I haven’t had the chance to simulate and verify
> this scenario yet.

I'll research more failure cases, but I believe the case you mentioned
is safe. If the startup process fails before completing all
end-of-recovery actions during the promotion, it raises a FATAL error,
leading to a server shutdown. Also, by the time it calls
UpdateLogicalDecodingStatusEndOfRecovery(), recovery is technically
finished; it has already assigned a new timeline ID, removed the signal
file, and updated the min recovery point in the control file.
Therefore, after the server restarts, it doesn't enter standby mode
but runs as the primary server with logical decoding disabled, which
is the correct state.
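
To make the ordering argument concrete, here is a toy model in C. It is
only a sketch: ToyCluster and all toy_* functions are hypothetical names
invented for illustration, not code from the patch or from PostgreSQL.
It assumes the restart mode depends only on whether the signal file is
still present.

```c
#include <stdbool.h>

/* Hypothetical, simplified on-disk state; not a PostgreSQL structure. */
typedef struct ToyCluster
{
    bool standby_signal_present;    /* stands in for the signal file */
    int  timeline_id;               /* current timeline */
    bool logical_decoding_enabled;  /* stands in for the persisted status */
} ToyCluster;

/* Steps that are already done before the status update is reached. */
static void
toy_finish_recovery(ToyCluster *c)
{
    c->timeline_id += 1;                /* new timeline ID assigned */
    c->standby_signal_present = false;  /* signal file removed */
}

/* Stand-in for the end-of-recovery status update. */
static void
toy_update_status_end_of_recovery(ToyCluster *c, bool has_logical_slot)
{
    c->logical_decoding_enabled = has_logical_slot;
}

/* Assumption of this model: restart mode is decided by the signal file. */
static bool
toy_restart_enters_standby(const ToyCluster *c)
{
    return c->standby_signal_present;
}

/*
 * Promotion in which the last logical slot was dropped, followed by a
 * failure (FATAL) before the remaining end-of-recovery actions run.
 * Returns the state that the next restart would see.
 */
static ToyCluster
toy_promote_then_crash(void)
{
    ToyCluster c = { true, 1, true };   /* standby, TLI 1, decoding enabled */

    toy_finish_recovery(&c);
    toy_update_status_end_of_recovery(&c, false);
    /* failure happens here; nothing after this point is executed */
    return c;
}
```

In this model, the state returned by toy_promote_then_crash() no longer
has the signal file, so the restart comes up as a primary on the new
timeline with logical decoding disabled, which matches the reasoning
above.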

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
