Re: Is Recovery actually paused?

From: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Is Recovery actually paused?
Date: 2021-01-14 13:18:06
Message-ID: 20210114221806.0f1c73a37f5fad91daa5e9f9@sraoss.co.jp
Lists: pgsql-hackers

On Wed, 13 Jan 2021 17:49:43 +0530
Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:

> On Wed, Jan 13, 2021 at 3:35 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Wed, Jan 13, 2021 at 3:27 PM Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp> wrote:
> > >
> > > On Thu, 10 Dec 2020 11:25:23 +0530
> > > Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > > > > > However, I suspect users don't expect pg_is_wal_replay_paused to wait.
> > > > > > > In particular, if max_standby_streaming_delay is -1, this will block forever,
> > > > > > > although this setting may be unusual. In addition, some users may set
> > > > > > > recovery_min_apply_delay to a large value. If such users call pg_is_wal_replay_paused,
> > > > > > > it could wait for a long time.
> > > > > > >
> > > > > > > At least, I think we need a description in the documentation explaining that
> > > > > > > pg_is_wal_replay_paused could wait for a while.
> > > > >
> > > > > Ok
> > > >
> > > > Fixed this, added some comments in .sgml as well as in function header
> > >
> > > Thank you for fixing this.
> > >
> > > Also, would it be better to change the description of pg_wal_replay_pause from
> > > "Pauses recovery." to "Request to pause recovery." in accordance with
> > > pg_is_wal_replay_paused?
> >
> > Okay
> >
> > >
> > > > > > Also, how about adding a new boolean argument to pg_is_wal_replay_paused to
> > > > > > control whether this waits for recovery to get paused or not? By setting its
> > > > > > default value to true or false, users can use the old format for calling this
> > > > > > and the backward compatibility can be maintained.
> > > > >
> > > > > So basically, if the wait_recovery_pause flag is false then we will
> > > > > immediately return true if the pause is requested? I agree that it is
> > > > > good to have an API to know whether the recovery pause is requested or
> > > > > not, but I am not sure it is a good idea to make this API serve both
> > > > > purposes. Does anyone else have any thoughts on this?
> > > > >
> > >
> > > I think the current pg_is_wal_replay_paused() already has another purpose;
> > > it waits for recovery to actually get paused. If we want to limit this API's
> > > purpose to only returning the pause state, it seems better to fix it to return
> > > the actual state at the cost of backward compatibility. If we want
> > > to know whether a pause is requested, we may add a new API like
> > > pg_is_wal_replay_pause_requested(). Also, if we want to wait for recovery to actually
> > > get paused, we may add an option to pg_wal_replay_pause() for this purpose.
> > > >
> > > However, this might be bikeshedding. If nobody minds that
> > > pg_is_wal_replay_paused() can make users wait for a long time, I don't mind either.
> >
> > I don't think that it will ever be blocked, because
> > pg_wal_replay_pause calls WakeupRecovery(), which means the
> > recovery process will not be stuck waiting for the WAL.

Yes, recovery will not get stuck waiting for the WAL. However, it can get stuck while resolving
a recovery conflict. The process could wait up to max_standby_streaming_delay or
max_standby_archive_delay before recovery is completely paused.

Also, it could wait for recovery_min_apply_delay if that parameter has a valid value. A user
may set this parameter to a large value, in which case the wait could be long. However,
this could be avoided by calling recoveryPausesHere() or CheckAndSetRecoveryPause() in
recoveryApplyDelay().
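
To illustrate the idea, here is a minimal standalone sketch, not the actual
recoveryApplyDelay() code. It assumes the delay is waited out in short slices and
that a pause request is checked before each slice, so a pause takes effect without
waiting for the whole delay. All names here (apply_delay_sketch, DelayResult,
pause_at_ms) are hypothetical, and time is simulated with a counter instead of a
real latch wait.

```c
#include <assert.h>

/* Hypothetical outcome of one delayed-apply wait. */
typedef enum
{
    DELAY_ELAPSED,          /* the full delay was waited out */
    DELAY_PAUSE_REQUESTED   /* the wait ended early because of a pause request */
} DelayResult;

/*
 * Wait up to delay_ms in slices of slice_ms, checking for a pause request
 * before each slice.  pause_at_ms is the simulated time at which a pause is
 * requested (-1 means never).  The time actually waited is stored in
 * *waited_ms.
 */
static DelayResult
apply_delay_sketch(long delay_ms, long slice_ms, long pause_at_ms,
                   long *waited_ms)
{
    long        elapsed = 0;

    assert(delay_ms >= 0 && slice_ms > 0);

    while (elapsed < delay_ms)
    {
        /* Check the pause request first, so a pause is noticed promptly. */
        if (pause_at_ms >= 0 && elapsed >= pause_at_ms)
        {
            *waited_ms = elapsed;
            return DELAY_PAUSE_REQUESTED;
        }

        /* Stand-in for a timed wait of one slice. */
        elapsed += slice_ms;
    }

    *waited_ms = elapsed;
    return DELAY_ELAPSED;
}
```

With a check like this, the worst-case latency of a pause request is one slice
rather than the whole recovery_min_apply_delay.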

> > > > > > As another comment, while pg_is_wal_replay_paused is blocking, I can not cancel
> > > > > > the query. I think CHECK_FOR_INTERRUPTS() is necessary in the waiting loop.
> > >
> > > How about this fix? I think users may want to cancel pg_is_wal_replay_paused() while
> > > it is blocking.
> >
> > Yeah, we can do this. I will send the updated patch after putting
> > some more thought into these comments. Thanks again for the feedback.
> >
>
> Please find the updated patch.

Thanks. I confirmed that I can cancel pg_is_wal_replay_paused() while it is blocked.
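
For reference, the behavior I checked can be pictured with the following standalone
sketch. It is not the patch code and does not call CHECK_FOR_INTERRUPTS(); it only
assumes that the waiting loop tests for a cancel request on every iteration before
testing whether recovery has paused. The names (WaitScenario, WaitResult,
wait_for_pause_sketch) and the tick-based timing are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of waiting for recovery to pause. */
typedef enum
{
    WAIT_NOT_REQUESTED,     /* no pause was requested, nothing to wait for */
    WAIT_PAUSED,            /* recovery actually paused */
    WAIT_CANCELLED,         /* the user cancelled the query while waiting */
    WAIT_STILL_WAITING      /* gave up after max_ticks (simulation only) */
} WaitResult;

/* Hypothetical scenario: when things happen, in loop iterations ("ticks"). */
typedef struct
{
    bool        pause_requested;
    int         paused_at_tick;     /* -1 means recovery never pauses */
    int         cancel_at_tick;     /* -1 means the user never cancels */
} WaitScenario;

static WaitResult
wait_for_pause_sketch(const WaitScenario *s, int max_ticks)
{
    assert(s != NULL && max_ticks >= 0);

    if (!s->pause_requested)
        return WAIT_NOT_REQUESTED;

    for (int tick = 0; tick <= max_ticks; tick++)
    {
        /* Interrupt check comes first, so a cancel always gets through. */
        if (s->cancel_at_tick >= 0 && tick >= s->cancel_at_tick)
            return WAIT_CANCELLED;

        if (s->paused_at_tick >= 0 && tick >= s->paused_at_tick)
            return WAIT_PAUSED;
    }

    return WAIT_STILL_WAITING;
}
```

Without the cancel check, the case where recovery never pauses would leave the
caller with no way out of the loop.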

Although it is a very trivial comment, I think that the blank line added before
HandleStartupProcInterrupts() is unnecessary.

@@ -6052,12 +6062,20 @@ recoveryPausesHere(bool endOfRecovery)
                 (errmsg("recovery has paused"),
                  errhint("Execute pg_wal_replay_resume() to continue.")));

-    while (RecoveryIsPaused())
+    while (RecoveryPauseRequested())
     {
+
         HandleStartupProcInterrupts();

Regards,
Yugo Nagata

--
Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
