Re: Is Recovery actually paused?

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Is Recovery actually paused?
Date: 2020-12-10 05:55:23
Message-ID: CAFiTN-tN325vACNZL+Y1JYqDTi1Egu8wawrjuyE9yxvAgdqsjw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 30, 2020 at 2:40 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Mon, Nov 30, 2020 at 12:17 PM Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp> wrote:
>
> Thanks for looking into this.
>
> > On Thu, 22 Oct 2020 20:36:48 +0530
> > Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > > On Thu, Oct 22, 2020 at 7:50 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > >
> > > > On Thu, Oct 22, 2020 at 6:59 AM Kyotaro Horiguchi
> > > > <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > > > >
> > > > > At Wed, 21 Oct 2020 11:14:24 -0400, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in
> > > > > > On Wed, Oct 21, 2020 at 7:16 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > > > > > One idea could be, if the recovery process is waiting for WAL and a
> > > > > > > recovery pause is requested then we can assume that the recovery is
> > > > > > > paused because before processing the next WAL it will always check
> > > > > > > whether the recovery pause is requested or not.
> > > > > ..
> > > > > > However, it might be better to implement this by having the system
> > > > > > absorb the pause immediately when it's in this state, rather than
> > > > > > trying to detect this state and treat it specially.
> > > > >
> > > > > The paused state is shown in pg_stat_activity.wait_event and it is
> > > > > strange that pg_is_wal_replay_paused() is inconsistent with the
> > > > > column.
> > > >
> > > > Right
> > > >
> > > > > To make them consistent, we need to call recoveryPausesHere()
> > > > > at the end of WaitForWALToBecomeAvailable() and let
> > > > > pg_wal_replay_pause() call WakeupRecovery().
> > > > >
> > > > > I think we don't need a separate function to find the state.
> > > >
> > > > The idea makes sense to me. I will try to change the patch as per the
> > > > suggestion.
> > >
> > > Here is the patch based on this idea.
> >
> > I reviewed this patch.
> >
> > First, I made a recovery conflict situation using a table lock.
> >
> > Standby:
> > #= begin;
> > #= select * from t;
> >
> > Primary:
> > #= begin;
> > #= lock t in ;
> >
> > After this, WAL of the table lock cannot be replayed due to a lock acquired
> > in the standby.
> >
> > Second, during the delay, I executed pg_wal_replay_pause() and
> > pg_is_wal_replay_paused(). Then, pg_is_wal_replay_paused() blocked until
> > max_standby_streaming_delay expired, and eventually returned true.
> >
> > I can also see the same behaviour by setting recovery_min_apply_delay.
> >
> > So, pg_is_wal_replay_paused waits for recovery to get paused and this works
> > successfully as expected.
> >
> > However, I suspect users don't expect pg_is_wal_replay_paused to wait.
> > In particular, if max_standby_streaming_delay is -1, this will block forever,
> > although this setting may not be usual. In addition, some users may set
> > recovery_min_apply_delay to a large value. If such users call
> > pg_is_wal_replay_paused, it could wait for a long time.
> >
> > At least, I think we need a description in the documentation explaining that
> > pg_is_wal_replay_paused could wait for a while.
>
> Ok

Fixed this, and added some comments in the .sgml documentation as well as in the function header.

> > Also, how about adding a new boolean argument to pg_is_wal_replay_paused to
> > control whether this waits for recovery to get paused or not? By setting its
> > default value to true or false, users can use the old format for calling this
> > and the backward compatibility can be maintained.
>
> So basically, if the wait_recovery_pause flag is false then we will
> immediately return true if the pause is requested? I agree that it is
> good to have an API to know whether the recovery pause is requested or
> not, but I am not sure whether it is a good idea to make this API serve
> both purposes. Does anyone else have any thoughts on this?
>
> >
> > As another comment, while pg_is_wal_replay_paused is blocking, I cannot cancel
> > the query. I think CHECK_FOR_INTERRUPTS() is necessary in the waiting loop.
> >
> >
> > + errhint("Recovery control functions can only be executed during recovery.")));
> >
> > There are a few tabs at the end of this line.
>
> I will fix.

Fixed this as well.
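
To make the two review points above concrete (an optional wait flag, and a cancellation check inside the waiting loop), here is a minimal sketch. It is a toy Python model, not code from the attached patch: RecoveryModel, QueryCancelled, reach_pause_point, cancel_event and poll_interval are all hypothetical names, and the wait=False behaviour simply follows the reading quoted above (return true as soon as a pause is requested), which the thread has not settled.

```python
import threading
import time


class QueryCancelled(Exception):
    """Hypothetical stand-in for the error raised when a query is cancelled."""


class RecoveryModel:
    """Toy model: a pause is first *requested* and only later takes effect."""

    def __init__(self):
        self._lock = threading.Lock()
        self.pause_requested = False
        self.paused = False

    def request_pause(self):
        # Models the pause request: it only sets a flag.
        with self._lock:
            self.pause_requested = True

    def reach_pause_point(self):
        # Models replay noticing the request before the next WAL record.
        with self._lock:
            if self.pause_requested:
                self.paused = True

    def is_paused(self, wait=True, cancel_event=None, poll_interval=0.01):
        """Report the pause state.

        wait=False: return immediately, True if a pause was requested
        (an assumed semantics, taken from the question in the thread).
        wait=True: block until the pause takes effect, checking for
        cancellation on every iteration of the loop.
        """
        while True:
            with self._lock:
                if not self.pause_requested:
                    return False
                if self.paused or not wait:
                    return True
            # Cancellation check inside the waiting loop, so a blocked
            # caller can always be interrupted.
            if cancel_event is not None and cancel_event.is_set():
                raise QueryCancelled("canceling statement due to user request")
            time.sleep(poll_interval)
```

Without the cancellation check, a caller using wait=True would stay blocked for as long as replay does not reach a pause point, which is the situation described for max_standby_streaming_delay = -1.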

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment: v2-0001-pg_is_wal_replay_paused-will-wait-for-recovery-to.patch (text/x-patch, 7.6 KB)
