Re: [HACKERS] Patch: add recovery_timeout option to control timeout of restore_command nonzero status code

From: Alexey Vasiliev <leopard_ne(at)inbox(dot)ru>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Patch: add recovery_timeout option to control timeout of restore_command nonzero status code
Date: 2014-12-30 12:10:21
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Thanks, I understand, what look in another part of code. Hope right now I did right changes.

To not modify current pg_usleep calculation, I changed restore_command_retry_interval value to seconds (not milliseconds). In this case, min value - 1 second.

Mon, 29 Dec 2014 00:15:03 +0900 от Michael Paquier <michael(dot)paquier(at)gmail(dot)com>:
>On Sat, Dec 27, 2014 at 3:42 AM, Alexey Vasiliev < leopard_ne(at)inbox(dot)ru > wrote:
>> Thanks for suggestions.
>> Patch updated.
>Cool, thanks. I just had an extra look at it.
>+ This is useful, if I using for restore of wal logs some
>+ external storage (like AWS S3) and no matter what the slave database
>+ will lag behind the master. The problem, what for each request to
>+ AWS S3 need to pay, what is why for N nodes, which try to get next
>+ wal log each 5 seconds will be bigger price, than for example each
>+ 30 seconds.
>I reworked this portion of the docs, it is rather incorrect as the
>documentation should not use first-person subjects, and I don't
>believe that referencing any commercial products is a good thing in
>this context.
>+# specifies an optional timeout after nonzero code of restore_command.
>+# This can be useful to increase/decrease number of a restore_command calls.
>This is still referring to a timeout. That's not good. And the name of
>the parameter at the top of this comment block is missing.
>+static int restore_command_retry_interval = 5000L;
>I think that it would be more adapted to set that to 5000, and
>multiply by 1L. I am also wondering about having a better lower bound,
>like 100ms to avoid some abuse with this feature in the retries?
>+ ereport(ERROR,
>+ errmsg("\"%s\" must
>be bigger zero",
>+ "restore_command_retry_interval")));
>I'd rather rewrite that to "must have a strictly positive value".
>- * Wait for more WAL to
>arrive. Time out after 5 seconds,
>+ * Wait for more WAL to
>arrive. Time out after
>+ *
>restore_command_retry_interval (5 seconds by default),
>                                         * like when polling the
>archive, to react to a trigger
>                                         * file promptly.
>                                         */
>                                        WaitLatch(&XLogCtl->recoveryWakeupLatch,
>                                                          WL_LATCH_SET
>- 5000L);
>I should have noticed earlier, but in its current state your patch
>actually does not work. What you are doing here is tuning the time
>process waits for WAL from stream. In your case what you want to
>control is the retry time for a restore_command in archive recovery,
>Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>To make changes to your subscription:

Alexey Vasiliev

Attachment Content-Type Size
=?UTF-8?B?MDAwMS1hZGQtcmVjb3ZlcnlfdGltZW91dC10by1jb250cm9sbC10aW1lb3V0?= =?UTF-8?B?LWJldHdlZW4tcmVzLnBhdGNo?= application/x-patch 4.2 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2014-12-30 12:23:38 Re: Compression of full-page-writes
Previous Message Tatsuo Ishii 2014-12-30 12:04:00 Re: Coverity and pgbench