Re: Disable startup progress timeout during standby WAL replay

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: masao(dot)fujii(at)gmail(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Disable startup progress timeout during standby WAL replay
Date: 2026-06-16 08:28:31
Message-ID: 20260616.172831.1323943214650766490.horikyota.ntt@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

At Thu, 11 Jun 2026 13:39:22 +0900, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote in
> On Thu, Jun 11, 2026 at 11:58 AM Kyotaro Horiguchi
> <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > I wonder whether we need to disable the startup progress timeout in
> > EnableStandbyMode() in the first place, if the intention is only to
> > suppress progress reporting during WAL replay on a standby.
>
> There can be cases where WAL replay starts with StandbyMode == false
> and EnableStandbyMode() is called later during WAL replay. For example,
> *as far as I remember correctly*, that might happen when starting with
> standby.signal but without backup_label. In such cases, we would still want
> EnableStandbyMode() to disable the timeout.

I see the point about the standby.signal case, where WAL replay may
already be in progress when EnableStandbyMode() is reached. In fact,
that example makes me think the opposite: it seems that
EnableStandbyMode() only needs to disable the timeout in this
particular case because replay has already started before it is
called.

The other call sites do not seem to share that property, and it looks
to me as though the timeout could still be legitimately needed there
until standby replay actually begins. That is the reason behind my
earlier comment questioning whether EnableStandbyMode() is the right
place to disable it.

> > Wouldn't it be simpler to handle the standby case at the existing
> > check, like this?
> >
> > if (!StandbyMode)
> > begin_startup_progress_phase();
> > + else
> > + disable_startup_progress_timeout();
>
> I thought the same at first, too. But I just thought the timeout should not
> be enabled even while reading the first WAL record, so I placed
> the check before reading the first record.

Similarly, if the timeout should be disabled before reading the first
WAL record, I wonder whether begin_startup_progress_phase() should
also be called at the same point.

That said, I agree that the proposed change should work as it is, and
this is ultimately a trade-off against the size of the fix. So I do
not object to this direction if others think it is the more practical
approach.

Regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhijie Hou (Fujitsu) 2026-06-16 08:54:11 RE: Fix race in ReplicationSlotRelease for ephemeral slots
Previous Message Richard Guo 2026-06-16 08:27:46 Re: assertion failure with unique index + partitioning + join