Re: Is Recovery actually paused?

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Is Recovery actually paused?
Date: 2021-01-17 06:03:52
Message-ID: CAFiTN-uBeT8BotbTaAYpkRbnMeHr0y0+5V38UPVd+n-nkxrmaQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 14, 2021 at 6:49 PM Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp> wrote:
>
> On Wed, 13 Jan 2021 17:49:43 +0530
> Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> > On Wed, Jan 13, 2021 at 3:35 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Jan 13, 2021 at 3:27 PM Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp> wrote:
> > > >
> > > > On Thu, 10 Dec 2020 11:25:23 +0530
> > > > Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > >
> > > > > > > However, I wonder users don't expect pg_is_wal_replay_paused to wait.
> > > > > > > Especially, if max_standby_streaming_delay is -1, this will be blocked forever,
> > > > > > > although this setting may not be usual. In addition, some users may set
> > > > > > > recovery_min_apply_delay for a large. If such users call pg_is_wal_replay_paused,
> > > > > > > it could wait for a long time.
> > > > > > >
> > > > > > > At least, I think we need some descriptions on document to explain
> > > > > > > pg_is_wal_replay_paused could wait while a time.
> > > > > >
> > > > > > Ok
> > > > >
> > > > > Fixed this, added some comments in .sgml as well as in function header
> > > >
> > > > Thank you for fixing this.
> > > >
> > > > Also, is it better to fix the description of pg_wal_replay_pause from
> > > > "Pauses recovery." to "Request to pause recovery." in according with
> > > > pg_is_wal_replay_paused?
> > >
> > > Okay
> > >
> > > >
> > > > > > > Also, how about adding a new boolean argument to pg_is_wal_replay_paused to
> > > > > > > control whether this waits for recovery to get paused or not? By setting its
> > > > > > > default value to true or false, users can use the old format for calling this
> > > > > > > and the backward compatibility can be maintained.
> > > > > >
> > > > > > So basically, if the wait_recovery_pause flag is false then we will
> > > > > > immediately return true if the pause is requested? I agree that it is
> > > > > > good to have an API to know whether the recovery pause is requested or
> > > > > > not but I am not sure is it good idea to make this API serve both the
> > > > > > purpose? Anyone else have any thoughts on this?
> > > > > >
> > > >
> > > > I think the current pg_is_wal_replay_paused() already has another purpose;
> > > > this waits recovery to actually get paused. If we want to limit this API's
> > > > purpose only to return the pause state, it seems better to fix this to return
> > > > the actual state at the cost of lacking the backward compatibility. If we want
> > > > to know whether pause is requested, we may add a new API like
> > > > pg_is_wal_replay_paluse_requeseted(). Also, if we want to wait recovery to actually
> > > > get paused, we may add an option to pg_wal_replay_pause() for this purpose.
> > > >
> > > > However, this might be a bikeshedding. If anyone don't care that
> > > > pg_is_wal_replay_paused() can make user wait for a long time, I don't care either.
> > >
> > > I don't think that it will be blocked ever, because
> > > pg_wal_replay_pause is sending the WakeupRecovery() which means the
> > > recovery process will not be stuck on waiting for the WAL.
>
> Yes, there is no stuck on waiting for the WAL. However, it can be stuck during resolving
> a recovery conflict. The process could wait for max_standby_streaming_delay or
> max_standby_archive_delay at most before recovery get completely paused.

Okay, I agree that it is possible so for handling this we have a
couple of options
1. pg_is_wal_replay_paused(), interface will wait for recovery to
actually get paused, but user have an option to cancel that. So I
agree that there is currently no option to just know that recovery
pause is requested without waiting for its actually get paused if it
is requested. So one option is we can provide an another interface as
you mentioned pg_is_wal_replay_paluse_requeseted(), which can just
return the request status. I am not sure how useful it is.

2. Pass an option to pg_is_wal_replay_paused whether to wait for
recovery to actually get paused or not.

3. Pass an option to pg_wal_replay_pause(), whether to wait for
recovery pause or just request and return.

I like the option 1, any other opinion on this?

> Also, it could wait for recovery_min_apply_delay if it has a valid value. It is possible
> that a user set this parameter to a large value, so it could wait for a long time. However,
> this will be avoided by calling recoveryPausesHere() or CheckAndSetRecoveryPause() in
> recoveryApplyDelay().

Right

> > > > > > > As another comment, while pg_is_wal_replay_paused is blocking, I can not cancel
> > > > > > > the query. I think CHECK_FOR_INTERRUPTS() is necessary in the waiting loop.
> > > >
> > > > How about this fix? I think users may want to cancel pg_is_wal_replay_paused() during
> > > > this is blocking.
> > >
> > > Yeah, we can do this. I will send the updated patch after putting
> > > some more thought into these comments. Thanks again for the feedback.
> > >
> >
> > Please find the updated patch.
>
> Thanks. I confirmed that I can cancel pg_is_wal_repaly_paused() during stuck.

Thanks

> Although it is a very trivial comment, I think that the new line before
> HandleStartupProcInterrupts() is unnecessary.
>
> @@ -6052,12 +6062,20 @@ recoveryPausesHere(bool endOfRecovery)
> (errmsg("recovery has paused"),
> errhint("Execute pg_wal_replay_resume() to continue.")));
>
> - while (RecoveryIsPaused())
> + while (RecoveryPauseRequested())
> {
> +
> HandleStartupProcInterrupts();
>
>

I will fix in the next version.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-01-17 06:24:57 Re: Key management with tests
Previous Message Amit Kapila 2021-01-17 05:57:48 Re: Deleting older versions in unique indexes to avoid page splits