Re: Is Recovery actually paused?

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: dilipbalaut(at)gmail(dot)com
Cc: robertmhaas(at)gmail(dot)com, bharath(dot)rupireddyforpostgres(at)gmail(dot)com, nagata(at)sraoss(dot)co(dot)jp, sawada(dot)mshk(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, simon(at)2ndquadrant(dot)com
Subject: Re: Is Recovery actually paused?
Date: 2021-02-24 07:09:48
Message-ID: 20210224.160948.1039609951917003501.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Tue, 23 Feb 2021 12:03:32 +0530, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote in
> On Fri, Feb 12, 2021 at 3:26 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > There might be some more to say here, but those are things I notice on
> > a first read-through.
>
> Okay.

It seems to me all the suggestions are addressed in this version.

+ Request to pause recovery. A request doesn't mean that recovery stops
+ right away. If you want a guarantee that recovery is actually paused,
+ you need to check for the recovery pause state returned by
+ <function>pg_get_wal_replay_pause_state()</function>. Note that
+ <function>pg_is_wal_replay_paused()</function> returns whether a request
+ is made. While recovery is paused, no further database changes are applied.

This looks like explainig the same thing twice. ("A request doesn't
mean.." and "While recovery is paused, ...")

How about something like this?

Request to pause recovery. Server actually stops recovery at a
convenient time. This can take a few seconds after the request. If you
need to strictly guarantee that no further database change will occur,
you can check using pg_get_wal_replay_ause_state(). Note that
pg_is_wal_replay_paused() may return true before recovery actually
stops.

The patch adds two loops whth the following logic:

while (GetRecoveryPauseState() != RECOVERY_NOT_PAUSED)
{
...
ConfirmRecoveryPaused();
<wait>
}

After the renaming of the function, the following structure looks
simpler and more natural.

while (ConfirmRecoveryPaused())
{
...
<wait>
}

+ /* test for recovery pause, if user has requested the pause */
+ if (((volatile XLogCtlData *) XLogCtl)->recoveryPauseState !=

The reason for the checkpoint is to move to "paused" state in a
reasonable time. I think we need to mention that reason rather than
what is done here.

+ /* get the recovery pause state */
+ switch(GetRecoveryPauseState())
+ {
+ case RECOVERY_NOT_PAUSED:
+ state = "not paused";
+ break;
...
+ default:
+ elog(ERROR, "invalid recovery pause state");

This disables the static enum coverage check and it is not likely to
have a wrong value here, other than the case of shared memory
corruption. So we can remove the default case
here. pg_get_replication_slots() is going that direction and
apply_dispatch() is taking a slightly different way. Anyway I think
that we can take away the default case.

regard.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2021-02-24 07:14:05 Re: About to add WAL write/fsync statistics to pg_stat_wal view
Previous Message kuroda.hayato@fujitsu.com 2021-02-24 06:33:12 RE: Refactor ECPGconnect and allow IPv6 connection