Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
Date: 2012-11-13 16:02:05
Message-ID: CAHGQGwExi29o0D8OKzKVPpXBwxqusgH=n2T0+Chxs=QMjEFUSA@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Tue, Nov 13, 2012 at 1:06 PM, Amit kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
> On Monday, November 12, 2012 8:23 PM Fujii Masao wrote:
> On Fri, Nov 9, 2012 at 3:03 PM, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
>> On Thursday, November 08, 2012 10:42 PM Fujii Masao wrote:
>>> On Thu, Nov 8, 2012 at 5:53 PM, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
>>> wrote:
>>> > On Thursday, November 08, 2012 2:04 PM Heikki Linnakangas wrote:
>>> >> On 19.10.2012 14:42, Amit kapila wrote:
>>> >> > On Thursday, October 18, 2012 8:49 PM Fujii Masao wrote:
>>> >> >> Before implementing the timeout parameter, I think that it's better to change
>>> >> >> both pg_basebackup background process and pg_receivexlog so that
>
>>>> BTW, IIRC the walsender has no timeout mechanism during sending
>>>> backup data to pg_basebackup. So it's also useful to implement the
>>>> timeout mechanism for the walsender during backup.
>>
>>> Yes, it's useful, but for the walsender the main problem is that it uses a
>>> blocking send call to send the data.
>>> I have tried using the tcp_keepalive settings, but the send call doesn't
>>> return in case of a network break.
>>> The only way I could get it to return is to change the value in
>>> /proc/sys/net/ipv4/tcp_retries2 with the command
>>>   echo "8" > /proc/sys/net/ipv4/tcp_retries2
>>> As per the recommendation, its value should be at least 8 (equivalent to
>>> 100 sec).
>>
>>> Do you have any idea how it can be achieved?
>
>> What about using pq_putmessage_noblock()?
>
> I will try this, but do you know why the code uses blocking mode to send files in the first place?
> I am asking because I am a little afraid of breaking some design assumption that was made when the file send was implemented as blocking.

I'm afraid I don't know why. My guess is that non-blocking mode would have
complicated the code, so the first version of pg_basebackup adopted
blocking mode.
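
To illustrate why non-blocking mode takes more code than a plain blocking
send(), here is a minimal sketch of a send with an overall deadline on a
non-blocking socket. It is not the walsender code and does not use
pq_putmessage_noblock(); the helper names send_with_timeout() and
elapsed_ms() and the timeout_ms parameter are hypothetical, chosen only for
this example.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: milliseconds elapsed since *start (monotonic clock). */
static long
elapsed_ms(const struct timespec *start)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    return (long) (now.tv_sec - start->tv_sec) * 1000L +
           (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/*
 * Hypothetical helper, not a PostgreSQL function: send len bytes on fd,
 * giving up once timeout_ms has passed in total.
 * Returns 0 on success, -1 on failure (errno is ETIMEDOUT on timeout).
 * Note: the socket is left in non-blocking mode, and a closed peer can
 * still raise SIGPIPE unless the caller ignores that signal.
 */
static int
send_with_timeout(int fd, const char *buf, size_t len, int timeout_ms)
{
    struct timespec start;
    size_t sent = 0;
    int flags;

    assert(buf != NULL || len == 0);

    /* Switch the socket to non-blocking mode so send() cannot hang. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    clock_gettime(CLOCK_MONOTONIC, &start);

    while (sent < len)
    {
        ssize_t n = send(fd, buf + sent, len - sent, 0);

        if (n > 0)
        {
            sent += (size_t) n;     /* partial sends are normal here */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;              /* hard error, errno set by send() */

        /* Send buffer is full: wait for writability, bounded by the deadline. */
        long remaining = timeout_ms - elapsed_ms(&start);

        if (remaining <= 0)
        {
            errno = ETIMEDOUT;
            return -1;
        }

        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        int rc = poll(&pfd, 1, (int) remaining);

        if (rc < 0 && errno != EINTR)
            return -1;
        /* rc == 0 (poll timed out): the next iteration re-checks the deadline. */
    }
    return 0;
}
```

Compared with a single blocking send(), the caller now has to handle
partial writes, EAGAIN, and the deadline itself, which is the kind of
extra complexity I was guessing at.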

Regards,

--
Fujii Masao
