From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | ashwinstar(at)gmail(dot)com |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: For standby pg_ctl doesn't wait for PM_STATUS_READY in presence of promote_trigger_file |
Date: | 2020-08-05 05:46:23 |
Message-ID: | 20200805.144623.2008802441758871789.horikyota.ntt@gmail.com |
Lists: | pgsql-hackers |
Hello.
At Tue, 4 Aug 2020 12:01:45 -0700, Ashwin Agrawal <ashwinstar(at)gmail(dot)com> wrote in
> If shutdown (non hot enabled) standby and promote the standby using
> promote_trigger_file via pg_ctl start with -w (wait), currently pg_ctl
> returns as soon as recovery is started. Instead would be helpful if
> pg_ctl can wait till PM_STATUS_READY for this case, given promotion is
> requested.
>
> pg_ctl -w returns as soon as recovery is started for non hot enabled
> standby because PM_STATUS_STANDBY is written
> on PMSIGNAL_RECOVERY_STARTED. Given the intent to promote the standby
> using promote_trigger_file, it would be better to not write
> PM_STATUS_STANDBY, instead let promotion complete and return only
> after connections can be actually accepted.
>
> Seems helpful behavior for users, though I am not sure about how much
> promote_trigger_file is used with non hot enabled standbys. This is
> something which will help to solidify some of the tests in Greenplum
> hence checking interest for the same here.
>
> It's doable via below patch:
It seems strange for "pg_ctl start" to wait for a server to
promote. Is there any reason you do it that way instead of running
pg_ctl start and then pg_ctl promote?
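
For context, the behavior described in the quoted text follows from how "pg_ctl start -w" decides it can stop waiting: it looks at the status line of postmaster.pid and returns once that line reads either "ready" or "standby". Below is a minimal sketch of that check, not the actual pg_ctl source. The function name wait_is_over is made up for illustration, and the padded status strings are reproduced from memory of pidfile.h, so treat the exact values as assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Status strings as recalled from pidfile.h (assumption); padded to equal length. */
#define PM_STATUS_STARTING "starting"
#define PM_STATUS_READY    "ready   "
#define PM_STATUS_STANDBY  "standby "

/*
 * Hypothetical helper: given the status line read from postmaster.pid,
 * report whether a waiting "pg_ctl start -w" would stop waiting.
 */
bool
wait_is_over(const char *pmstatus)
{
    assert(pmstatus != NULL);

    /* Both "ready" and "standby" end the wait; anything else keeps polling. */
    return strcmp(pmstatus, PM_STATUS_READY) == 0 ||
           strcmp(pmstatus, PM_STATUS_STANDBY) == 0;
}
```

Since a non-hot standby writes "standby" as soon as recovery starts, the wait ends at that point, long before promotion completes.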
> diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
> index 5b5fc97c72..c49010aa5a 100644
> --- a/src/backend/postmaster/postmaster.c
> +++ b/src/backend/postmaster/postmaster.c
> @@ -5197,6 +5197,7 @@ sigusr1_handler(SIGNAL_ARGS)
> if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_STARTED) &&
> pmState == PM_STARTUP && Shutdown == NoShutdown)
> {
> + bool promote_trigger_file_exist = false;
> /* WAL redo has started. We're out of reinitialization. */
> FatalError = false;
> AbortStartTime = 0;
> @@ -5218,12 +5219,25 @@ sigusr1_handler(SIGNAL_ARGS)
> if (XLogArchivingAlways())
> PgArchPID = pgarch_start();
>
> + {
> + /*
> + * if promote trigger file exist we don't wish to convey
> + * PM_STATUS_STANDBY, instead wish pg_ctl -w to wait till
> + * connections can be actually accepted by the database.
> + */
> + struct stat stat_buf;
> + if (PromoteTriggerFile != NULL &&
> + strcmp(PromoteTriggerFile, "") != 0 &&
> + stat(PromoteTriggerFile, &stat_buf) == 0)
> + promote_trigger_file_exist = true;
> + }
> +
> /*
> * If we aren't planning to enter hot standby mode later, treat
> * RECOVERY_STARTED as meaning we're out of startup, and report status
> * accordingly.
> */
> - if (!EnableHotStandby)
> + if (!EnableHotStandby && !promote_trigger_file_exist)
> {
> AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS,
> PM_STATUS_STANDBY);
> #ifdef USE_SYSTEMD
In addition to the above, regarding the patch, I'm not sure it's a good
thing for the postmaster process to become aware of
PromoteTriggerFile.
Maybe we could change the behavior of "pg_ctl start" so that it waits for
the consistency point when archive recovery runs (somewhat like the
case of standbys) by adding a PM-signal, say,
PMSIGNAL_CONSISTENCY_REACHED?
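
To make the idea a bit more concrete, here is a toy model of when the postmaster would update the status line, comparing the current behavior with the suggested one. This is only a sketch under assumptions, not postmaster code: the enum, the function toy_status_on_signal and the defer_until_consistent flag are all hypothetical, PMSIGNAL_CONSISTENCY_REACHED does not exist today, and which status string would be written at the consistency point is a guess.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for postmaster signals (names are illustrative only). */
typedef enum
{
    SIG_RECOVERY_STARTED,
    SIG_CONSISTENCY_REACHED,    /* hypothetical: no such PM-signal exists today */
    SIG_BEGIN_HOT_STANDBY
} ToySignal;

/*
 * Return the status string the toy postmaster would write to the lock file
 * on receiving sig, or NULL if the status line is left unchanged.
 */
const char *
toy_status_on_signal(ToySignal sig, bool enable_hot_standby,
                     bool defer_until_consistent)
{
    switch (sig)
    {
        case SIG_RECOVERY_STARTED:
            /* Current behavior: a non-hot standby reports "standby" at once. */
            if (!enable_hot_standby && !defer_until_consistent)
                return "standby ";
            return NULL;

        case SIG_CONSISTENCY_REACHED:
            /* Suggested: report only once recovery has reached consistency. */
            if (!enable_hot_standby && defer_until_consistent)
                return "standby ";
            return NULL;

        case SIG_BEGIN_HOT_STANDBY:
            /* A hot standby accepts read-only connections, so report "ready". */
            return enable_hot_standby ? "ready   " : NULL;
    }

    /* Not reached for valid enum values. */
    assert(0);
    return NULL;
}
```

With defer_until_consistent set, the status line stays at its startup value through RECOVERY_STARTED, so a waiting "pg_ctl start" keeps polling until the consistency point, and the postmaster never needs to look at the trigger file.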
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center