Re: Is Recovery actually paused?

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Is Recovery actually paused?
Date: 2020-11-30 09:10:08
Message-ID: CAFiTN-tU9-+a2b-Zr1ShVkEJj5F-qYRS1MFO8UCxJS=wtvpvvA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 30, 2020 at 12:17 PM Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp> wrote:

Thanks for looking into this.

> On Thu, 22 Oct 2020 20:36:48 +0530
> Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> > On Thu, Oct 22, 2020 at 7:50 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Oct 22, 2020 at 6:59 AM Kyotaro Horiguchi
> > > <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > > >
> > > > At Wed, 21 Oct 2020 11:14:24 -0400, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in
> > > > > On Wed, Oct 21, 2020 at 7:16 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > > > > One idea could be, if the recovery process is waiting for WAL and a
> > > > > > recovery pause is requested then we can assume that the recovery is
> > > > > > paused because before processing the next wal it will always check
> > > > > > whether the recovery pause is requested or not.
> > > > ..
> > > > > However, it might be better to implement this by having the system
> > > > > absorb the pause immediately when it's in this state, rather than
> > > > > trying to detect this state and treat it specially.
> > > >
> > > > The paused state is shown in pg_stat_activity.wait_event and it is
> > > > strange that pg_is_wal_replay_paused() is inconsistent with the
> > > > column.
> > >
> > > Right
> > >
> > > To make them consistent, we need to call recoveryPausesHere()
> > > > at the end of WaitForWALToBecomeAvailable() and let
> > > > pg_wal_replay_pause() call WakeupRecovery().
> > > >
> > > > I think we don't need a separate function to find the state.
> > >
> > > The idea makes sense to me. I will try to change the patch as per the
> > > suggestion.
> >
> > Here is the patch based on this idea.
>
> I reviewd this patch.
>
> First, I made a recovery conflict situation using a table lock.
>
> Standby:
> #= begin;
> #= select * from t;
>
> Primary:
> #= begin;
> #= lock t in ;
>
> After this, WAL of the table lock cannot be replayed due to a lock acquired
> in the standby.
>
> Second, during the delay, I executed pg_wal_replay_pause() and
> pg_is_wal_replay_paused(). Then, pg_is_wal_replay_paused was blocked until
> max_standby_streaming_delay was expired, and eventually returned true.
>
> I can also see the same behaviour by setting recovery_min_apply_delay.
>
> So, pg_is_wal_replay_paused waits for recovery to get paused and this works
> successfully as expected.
>
> However, I wonder users don't expect pg_is_wal_replay_paused to wait.
> Especially, if max_standby_streaming_delay is -1, this will be blocked forever,
> although this setting may not be usual. In addition, some users may set
> recovery_min_apply_delay for a large. If such users call pg_is_wal_replay_paused,
> it could wait for a long time.
>
> At least, I think we need some descriptions on document to explain
> pg_is_wal_replay_paused could wait while a time.

Ok

> Also, how about adding a new boolean argument to pg_is_wal_replay_paused to
> control whether this waits for recovery to get paused or not? By setting its
> default value to true or false, users can use the old format for calling this
> and the backward compatibility can be maintained.

So basically, if the wait_recovery_pause flag is false then we will
immediately return true if the pause is requested? I agree that it is
good to have an API to know whether the recovery pause is requested or
not but I am not sure is it good idea to make this API serve both the
purpose? Anyone else have any thoughts on this?

>
> As another comment, while pg_is_wal_replay_paused is blocking, I can not cancel
> the query. I think CHECK_FOR_INTERRUPTS() is necessary in the waiting loop.
>
>
> + errhint("Recovery control functions can only be executed during recovery.")));
>
> There are a few tabs at the end of this line.

I will fix.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message tsunakawa.takay@fujitsu.com 2020-11-30 09:13:58 RE: POC: postgres_fdw insert batching
Previous Message Ajin Cherian 2020-11-30 09:05:49 Re: [HACKERS] logical decoding of two-phase transactions