Re: Patch: add recovery_timeout option to control timeout of restore_command nonzero status code

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Alexey Vasiliev <leopard_ne(at)inbox(dot)ru>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: add recovery_timeout option to control timeout of restore_command nonzero status code
Date: 2015-02-05 03:36:20
Message-ID: CAHGQGwFqA+4dc9oJjD5B0y6cbe5Ze5kwBjN+AGuKr-h6M8ZdQA@mail.gmail.com

On Tue, Jan 20, 2015 at 12:57 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Tue, Jan 20, 2015 at 1:56 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> On 2015-01-19 17:16:11 +0900, Michael Paquier wrote:
>>> On Mon, Jan 19, 2015 at 4:10 PM, Michael Paquier
>>> <michael(dot)paquier(at)gmail(dot)com> wrote:
>>> > On Sat, Jan 17, 2015 at 2:44 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>>> >> Not this patch's fault, but I'm getting a bit tired of seeing the above
>>> >> open-coded. How about adding a function that does the sleeping based on a
>>> >> timestamptz and a ms interval?
>>> > You mean in plugins, right? I don't recall seeing similar patterns in
>>> > other code paths of the backend. But I think we could add something like
>>> > that in timestamp.c, since we need to measure the interval between two
>>> > timestamps, the last failure and now():
>>> > TimestampSleepDifference(start_time, stop_time, interval_ms);
>>> > Perhaps you have something else in mind?
>>> >
>>> > Attached is an updated patch.
>>
>>> Actually I came up with something better than the last patch by using a
>>> boolean flag as the return value of TimestampSleepDifference and using
>>> TimestampDifferenceExceeds directly inside it.
>>
>>> Subject: [PATCH] Add wal_availability_check_interval to control WAL fetching
>>> on failure
>>
>> I think that name isn't very good, and it isn't very accurate either.
>> How about wal_retrieve_retry_interval? Not very nice, but I think it's
>> still better than the above.
> Fine for me.
>
>>> + <varlistentry id="wal-availability-check-interval" xreflabel="wal_availability_check_interval">
>>> + <term><varname>wal_availability_check_interval</varname> (<type>integer</type>)
>>> + <indexterm>
>>> + <primary><varname>wal_availability_check_interval</> recovery parameter</primary>
>>> + </indexterm>
>>> + </term>
>>> + <listitem>
>>> + <para>
>>> + This parameter specifies the amount of time to wait when
>>> + WAL is not available for a node in recovery. Default value is
>>> + <literal>5s</>.
>>> + </para>
>>> + <para>
>>> + A node in recovery will wait for this amount of time if
>>> + <varname>restore_command</> returns nonzero exit status code when
>>> + fetching new WAL segment files from archive or when a WAL receiver
>>> + is not able to fetch a WAL record when using streaming replication.
>>> + </para>
>>> + </listitem>
>>> + </varlistentry>
>>> +
>>> </variablelist>
>>
>> Walreceiver doesn't wait that amount; rather it waits for as long as the
>> connection is intact. And restore_command may or may not retry.
> I changed this paragraph as follows:
> +        This parameter specifies the amount of time to wait when a failure
> +        occurs while reading WAL from a source (be it via streaming
> +        replication, local <filename>pg_xlog</> or WAL archive) for a node
> +        in standby mode, or when WAL is expected from a source. The default
> +        value is <literal>5s</>.
> Pondering this patch some more, I get the feeling that it is not much of
> a win to explain in detail in the docs each failure mode depending on the
> source from which WAL is taken. But it is essential to mention that the
> wait happens only for a node in standby mode.
>
> Btw, I noticed two extra things that I think should be done for consistency:
> - We should also use wal_retrieve_retry_interval when waiting for
> WAL to arrive after checking for failures:
>                 /*
> -                * Wait for more WAL to arrive. Time out after 5 seconds,
> -                * like when polling the archive, to react to a trigger
> -                * file promptly.
> +                * Wait for more WAL to arrive. Time out after
> +                * wal_retrieve_retry_interval ms, like when polling the archive
> +                * to react to a trigger file promptly.
>                  */
>                 WaitLatch(&XLogCtl->recoveryWakeupLatch,
>                           WL_LATCH_SET | WL_TIMEOUT,
> -                         5000L);
> +                         wal_retrieve_retry_interval * 1L);
> -
>
> Updated patch attached.
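
To make the proposed behavior concrete, a hypothetical recovery.conf for
such a standby might look like the following (the restore_command is only
an illustration, and the parameter is shown under its renamed form,
wal_retrieve_retry_interval):

    standby_mode = 'on'
    restore_command = 'cp /mnt/server/archive/%f "%p"'
    wal_retrieve_retry_interval = '5s'   # wait 5s before retrying after a failure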

+ TimestampDifference(start_time, stop_time, &secs, &microsecs);
+ pg_usleep(interval_msec * 1000L - (1000000L * secs + 1L * microsecs));
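
For readers without the patch at hand, the helper around that excerpt
presumably looks roughly like this, going by the description upthread
(boolean return value, TimestampDifferenceExceeds used directly inside);
a sketch, not the patch's exact code:

    /* requires utils/timestamp.h */
    static bool
    TimestampSleepDifference(TimestampTz start_time, TimestampTz stop_time,
                             int interval_msec)
    {
        long    secs;
        int     microsecs;

        /* Nothing to sleep for if the interval has already elapsed. */
        if (TimestampDifferenceExceeds(start_time, stop_time, interval_msec))
            return false;

        /* Sleep away the remainder of the interval. */
        TimestampDifference(start_time, stop_time, &secs, &microsecs);
        pg_usleep(interval_msec * 1000L - (1000000L * secs + 1L * microsecs));
        return true;
    }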

What if a large interval is set and a signal arrives during the sleep?
I'm afraid that a signal cannot terminate the sleep on some platforms.
This problem exists even now because pg_usleep is used, but the sleep
time is just 5 seconds, so it's not so bad. The patch, however, allows a
user to set a large sleep time. Shouldn't we use WaitLatch or split the
pg_usleep like recoveryPausesHere() does?
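
For example, a minimal sketch of the split-sleep approach, with a
hypothetical helper name and 100ms slices; calling
HandleStartupProcInterrupts() between slices lets the startup process
notice a signal promptly even on platforms where a signal does not
interrupt pg_usleep():

    static void
    WaitRetryInterval(long remaining_msec)
    {
        /* Sleep in short slices so signals are noticed promptly. */
        while (remaining_msec > 0)
        {
            long    slice_msec = Min(remaining_msec, 100);

            pg_usleep(slice_msec * 1000L);      /* pg_usleep() takes microseconds */
            HandleStartupProcInterrupts();      /* react to signals between slices */
            remaining_msec -= slice_msec;
        }
    }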

Regards,

--
Fujii Masao
