Re: For standby pg_ctl doesn't wait for PM_STATUS_READY in presence of promote_trigger_file

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: ashwinstar(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: For standby pg_ctl doesn't wait for PM_STATUS_READY in presence of promote_trigger_file
Date: 2020-08-05 05:46:23
Message-ID: 20200805.144623.2008802441758871789.horikyota.ntt@gmail.com
Lists: pgsql-hackers

Hello.

At Tue, 4 Aug 2020 12:01:45 -0700, Ashwin Agrawal <ashwinstar(at)gmail(dot)com> wrote in
> If shutdown (non hot enabled) standby and promote the standby using
> promote_trigger_file via pg_ctl start with -w (wait), currently pg_ctl
> returns as soon as recovery is started. Instead would be helpful if
> pg_ctl can wait till PM_STATUS_READY for this case, given promotion is
> requested.
>
> pg_ctl -w returns as soon as recovery is started for non hot enabled
> standby because PM_STATUS_STANDBY is written
> on PMSIGNAL_RECOVERY_STARTED. Given the intent to promote the standby
> using promote_trigger_file, it would be better to not write
> PM_STATUS_STANDBY, instead let promotion complete and return only
> after connections can be actually accepted.
>
> Seems helpful behavior for users, though I am not sure about how much
> promote_trigger_file is used with non hot enabled standbys. This is
> something which will help to solidify some of the tests in Greenplum
> hence checking interest for the same here.
>
> It's doable via below patch:

It seems strange for "pg_ctl start" to wait for a server to promote.
Is there any reason you do it that way instead of running pg_ctl start
and then pg_ctl promote?

> diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
> index 5b5fc97c72..c49010aa5a 100644
> --- a/src/backend/postmaster/postmaster.c
> +++ b/src/backend/postmaster/postmaster.c
> @@ -5197,6 +5197,7 @@ sigusr1_handler(SIGNAL_ARGS)
>      if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_STARTED) &&
>          pmState == PM_STARTUP && Shutdown == NoShutdown)
>      {
> +        bool promote_trigger_file_exist = false;
>          /* WAL redo has started. We're out of reinitialization. */
>          FatalError = false;
>          AbortStartTime = 0;
> @@ -5218,12 +5219,25 @@ sigusr1_handler(SIGNAL_ARGS)
>          if (XLogArchivingAlways())
>              PgArchPID = pgarch_start();
>
> +        {
> +            /*
> +             * if promote trigger file exist we don't wish to convey
> +             * PM_STATUS_STANDBY, instead wish pg_ctl -w to wait till
> +             * connections can be actually accepted by the database.
> +             */
> +            struct stat stat_buf;
> +            if (PromoteTriggerFile != NULL &&
> +                strcmp(PromoteTriggerFile, "") != 0 &&
> +                stat(PromoteTriggerFile, &stat_buf) == 0)
> +                promote_trigger_file_exist = true;
> +        }
> +
>          /*
>           * If we aren't planning to enter hot standby mode later, treat
>           * RECOVERY_STARTED as meaning we're out of startup, and report status
>           * accordingly.
>           */
> -        if (!EnableHotStandby)
> +        if (!EnableHotStandby && !promote_trigger_file_exist)
>          {
>              AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STANDBY);
>  #ifdef USE_SYSTEMD

In addition to the above, regarding the patch, I'm not sure it's a
good thing for the postmaster process to become aware of
PromoteTriggerFile.

Maybe we could change the behavior of "pg_ctl start" to wait for the
consistency point when archive recovery runs (somewhat similarly to the
case of standbys) by adding a PM-signal, say,
PMSIGNAL_CONSISTENCY_REACHED?
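
For reference, the wait side of this discussion can be sketched as
follows. This is only a minimal illustration, not the code in
pg_ctl.c: the helper name sketch_wait_is_over and the SKETCH_STATUS_*
macros are hypothetical, and the padded status strings are assumed to
mirror the PM_STATUS_* values in src/include/utils/pidfile.h. It shows
that a "pg_ctl -w" style wait would end on either status, so the timing
depends entirely on when the postmaster writes the status line.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Status strings as they would appear on the status line of
 * postmaster.pid.  Assumption: these mirror the PM_STATUS_* values in
 * src/include/utils/pidfile.h, each padded to the same width.
 */
#define SKETCH_STATUS_STARTING "starting"
#define SKETCH_STATUS_READY    "ready   "
#define SKETCH_STATUS_STANDBY  "standby "

/*
 * Hypothetical helper (not a function in pg_ctl.c): returns true when
 * the status string read from postmaster.pid would end a "pg_ctl -w"
 * style wait.  Both "ready" and "standby" count as "done starting up".
 */
bool
sketch_wait_is_over(const char *pmstatus)
{
    assert(pmstatus != NULL);

    return strcmp(pmstatus, SKETCH_STATUS_READY) == 0 ||
           strcmp(pmstatus, SKETCH_STATUS_STANDBY) == 0;
}
```

Under this reading, a signal such as PMSIGNAL_CONSISTENCY_REACHED would
presumably change only the moment at which the postmaster writes the
status, leaving a check like the one above untouched.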

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
