Re: recovery_target_action=pause with confusing hint

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: "movead(dot)li(at)highgo(dot)ca" <movead(dot)li(at)highgo(dot)ca>, Sergei Kornilov <sk(at)zsrv(dot)org>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: recovery_target_action=pause with confusing hint
Date: 2020-04-01 07:22:03
Message-ID: d81868d1-6ab3-4d9b-f608-1b939aee376c@oss.nttdata.com
Lists: pgsql-hackers

On 2020/04/01 11:42, movead(dot)li(at)highgo(dot)ca wrote:
>
>>> When I test the patch, I find an issue: I start a stream with 'promote_trigger_file'
>>> GUC valid, and exec pg_wal_replay_pause() during recovery and as below it
>>> shows success to pause at the first time. I think it use a initialize
>>> 'SharedPromoteIsTriggered' value first time I exec the pg_wal_replay_pause().
>> hm. Are you sure this is related to this patch? Could you explain the exact timing? I mean log_statement = all and relevant logs.
>> Most likely this is expected change by https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=496ee647ecd2917369ffcf1eaa0b2cdca07c8730
>> My proposal does not change the behavior after this commit, only changing the lines in the logs.
> I test it again with (92d31085e926253aa650b9d1e1f2f09934d0ddfc), and the
> issue appeared again. Here is my test method which quite simple:
> 1. Setup a base backup by pg_basebackup.
> 2. Insert lots of data in master for the purpose I have enough time to exec
>    pg_wal_replay_pause() when startup the replication.
> 3. Configure the 'promote_trigger_file' GUC and create the trigger file.
> 4. Start the backup(standby), connect it immediately, and exec pg_wal_replay_pause()
> Then it appears, and a test log attached.
>
> I means when I exec the pg_wal_replay_pause() first time, nobody has check the trigger state
> by CheckForStandbyTrigger(), it use a Initialized 'SharedPromoteIsTriggered' value.
> And patch attached can solve the issue.

Thanks for the explanation!

But sorry, I still don't understand the issue that you reported.
Do you mean that the first call of pg_wal_replay_pause() in step #2
should check whether the trigger file exists? If so, could you
tell me why we should do that?

BTW, right now only the startup process is allowed to call
CheckForStandbyTrigger(). So a backend process calling
pg_wal_replay_pause() and PromoteIsTriggered() is not allowed to call
CheckForStandbyTrigger(). The current logic is that the startup process
is responsible for checking the trigger file and setting the flag in
shared memory if promotion is triggered. Other processes, such as
backends, then learn from shared memory whether promotion is ongoing.
So basically a backend doesn't need to check the trigger file itself.
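
To make that division of labor concrete, here is a minimal standalone
sketch. It is not the actual xlog.c code: the struct, the function names
and the use of a C11 atomic (instead of the real shared-memory flag and
its locking) are all assumptions made for illustration only.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Hypothetical stand-in for the shared-memory state. The real server keeps
 * its flag elsewhere and protects it differently; this is only a model.
 */
typedef struct
{
    atomic_bool promote_is_triggered;
} SharedRecoveryState;

/* Static storage is zero-initialized, so the flag starts out false. */
static SharedRecoveryState shared_state;

/*
 * Startup process only: look for the trigger file and, if it exists,
 * publish that fact through the shared flag.
 */
static bool
startup_check_for_trigger(SharedRecoveryState *st, const char *trigger_path)
{
    struct stat sb;

    if (trigger_path != NULL && stat(trigger_path, &sb) == 0)
    {
        atomic_store(&st->promote_is_triggered, true);
        return true;
    }
    return false;
}

/*
 * Backends: never look at the file, only read what the startup
 * process has published.
 */
static bool
backend_promote_is_triggered(SharedRecoveryState *st)
{
    return atomic_load(&st->promote_is_triggered);
}
```

In this model, a backend that reads the flag before the startup process
has run its check sees "false" even if the trigger file already exists,
which seems to be the timing window described in the report.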

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
