Re: Patch: add recovery_timeout option to control timeout of restore_command nonzero status code

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Alexey Vasiliev <leopard_ne(at)inbox(dot)ru>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: add recovery_timeout option to control timeout of restore_command nonzero status code
Date: 2015-02-08 05:54:16
Message-ID: CAB7nPqQ+S91a3LRvTip4Kdvnur1PN40Au3qnBP3gEDp5BmpvWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 6, 2015 at 4:58 PM, Fujii Masao wrote:
> timestamp.c:1708: warning: implicit declaration of function
> 'HandleStartupProcInterrupts'
>
> I got the above warning at the compilation.
>
> + pg_usleep(wait_time);
> + HandleStartupProcInterrupts();
> + total_time -= wait_time;
>
> It seems strange to call the startup-process-specific function in
> the "generic" function like TimestampSleepDifference().

I changed the sleep to a WaitLatch and moved all the logic back to
xlog.c, removing at the same the stuff in timestamp.c because it seems
strange to add a dependency with even latch there.

> * 5. Sleep 5 seconds, and loop back to 1.
>
> In WaitForWALToBecomeAvailable(), the above comment should
> be updated.

Done.

> - * Wait for more WAL to arrive. Time out after 5 seconds,
> - * like when polling the archive, to react to a trigger
> - * file promptly.
> + * Wait for more WAL to arrive. Time out after
> the amount of
> + * time specified by wal_retrieve_retry_interval, like
> + * when polling the archive, to react to a
> trigger file promptly.
> */
> WaitLatch(&XLogCtl->recoveryWakeupLatch,
> WL_LATCH_SET | WL_TIMEOUT,
> - 5000L);
> + wal_retrieve_retry_interval * 1000L);
>
> This change can prevent the startup process from reacting to
> a trigger file. Imagine the case where the large interval is set
> and the user want to promote the standby by using the trigger file
> instead of pg_ctl promote. I think that the sleep time should be 5s
> if the interval is set to more than 5s. Thought?

I disagree here. It is interesting to accelerate the check of WAL
availability from a source in some cases for replication, but the
opposite is true as well as mentioned by Alexey at the beginning of
the thread to reduce the number of requests when requesting WAL
archives from an external storage type AWS. Hence a correct solution
would be to check periodically for the trigger file with a maximum
one-time wait of 5s to ensure backward-compatible behavior. We could
reduce it to 1s or something like that as well.

> +# specifies an optional internal to wait for WAL to become available when
>
> s/internal/interval?

This part has been removed in the new part as parameter is set as a GUC.

>
> + This parameter specifies the amount of time to wait when a failure
> + occurred when reading WAL from a source (be it via streaming
> + replication, local <filename>pg_xlog</> or WAL archive) for a node
> + in standby mode, or when WAL is expected from a source.
>
> At least to me, it seems not easy to understand what the parameter is
> from this description. What about the following, for example?
> This parameter specifies the amount of time to wait when WAL is not
> available from any sources (streaming replication, local
> <filename>pg_xlog</> or WAL archive) before retrying to retrieve WAL....

OK, done.

> Isn't it better to place this parameter in postgresql.conf rather than
> recovery.conf? I'd like to change the value of this parameter without
> restarting the server.

Yes, agreed. Done so with a SIGHUP guc.
Regards,
--
Michael

Attachment Content-Type Size
20150206_wal_source_check_v6.patch text/x-diff 8.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shay Rojansky 2015-02-08 08:56:17 Re: Fetch zero result rows when executing a query?
Previous Message Amit Kapila 2015-02-08 03:36:07 Re: Parallel Seq Scan