Re: Is Recovery actually paused?

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Is Recovery actually paused?
Date: 2021-01-21 09:59:13
Message-ID: CALj2ACXhS9oqUe2XhHhh-PRnm1=xB6M7CUrdQoS3V69Enis4gw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 19, 2021 at 9:32 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> In the last patch there were some local changes which I did not add to
> the patch and it was giving compilation warning so fixed that along
> with that I have addressed your this comment as well.

Thanks for the patch. I took a look at the v5 patch, below are some
comments. Please ignore if I'm repeating any of the comments discussed
upthread.

[1] Can we also have a wait version for pg_wal_replay_pause that waits
until recovery is actually paused right after setting the recovery
state to RECOVERY_PAUSE_REQUESTED? Something like this -
pg_wal_replay_pause_and_wait(wait boolean, wait_seconds integer
DEFAULT 60) returns boolean. It waits until either default or provided
wait_seconds and returns true if the recovery is paused within that
wait_seconds otherwise false. If wait_seconds is 0 or -1, then it
waits until recovery is paused and returns true. One advantage of this
function is that users don't need to call pg_is_wal_replay_paused().
IMHO, the job of ensuring whether or not the recovery is actually
paused, is better done by the one who requests
it(pg_wal_replay_pause/pg_wal_replay_pause_and_wait).

[2] Is it intentional that RecoveryPauseRequested() returns true even
if XLogCtl->recoveryPauseState is either RECOVERY_PAUSE_REQUESTED or
RECOVERY_PAUSED?

[3] Can we change IsRecoveryPaused() instead of RecoveryIsPaused() and
IsRecoveryPauseRequested() instead of RecoveryPauseRequested()? How
about having inline(because they have one line of code) functions like
IsRecoveryPauseRequested(), IsRecoveryPaused() and
IsRecoveryInProgress(), returning true when RECOVERY_PAUSE_REQUESTED,
RECOVERY_PAUSED and RECOVERY_IN_PROGRESS respectively?

[4] Can we have at least one line of comments for each of the new
functions, I know the function names mean everything? And also for
existing SetRecoveryPause() and GetRecoveryPauseState()?

[5] Typo, it's "every time" ---> + * this everytime.

[6] Do we reach PG_RETURN_BOOL(true); at the end of
pg_is_wal_replay_paused()? If not, is it there for satisfying the
compiler?

[7] In pg_is_wal_replay_paused(), do we need if
(!RecoveryPauseRequested()) inside the for (;;)? If yes, can we add
comments about why we need it there?

[8] Isn't it good to have
pgstat_report_wait_start(WAIT_EVENT_RECOVERY_PAUSE) and
pgstat_report_wait_end(), just before for (;;) and at the end
respectively? This will help users in knowing what they are waiting
for? Alternatively we can issue notice/warnings/write to server log
while we are waiting in for (;;) for the recovery to get paused?

[9] In pg_is_wal_replay_paused(), is 10 msec sleep time chosen
randomly or based on some analysis that the time it takes to get to
recovery paused state from request state or some other?

[10] errhint("The standby was promoted while waiting for recovery to
be paused."))); Can we know whether standby is actually promoted and
throw this error? Because the error "recovery is not in progress" here
is being thrown by just relying on if (!RecoveryInProgress()). IIUC,
using pg_promote

[11] Can we try to add tests for these functions in TAP? Currently, we
don't have any tests for pg_is_wal_replay_paused, pg_wal_replay_resume
or pg_wal_replay_pause, but we have tests for pg_promote in timeline
switch.

[12] Isn't it better to change RecoveryPauseState enum
RECOVERY_IN_PROGRESS value to start from 1 instead of 0? Because when
XLogCtl shared memory is initialized, I think recoveryPauseState can
be 0, so should it mean recovery in progress?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2021-01-21 10:14:10 Re: ResourceOwner refactoring
Previous Message Daniel Gustafsson 2021-01-21 09:42:02 Refactor SSL test framework to support multiple TLS libraries