Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
Date: 2012-09-13 17:27:52
Message-ID: CAHGQGwFhcg0BwOTNzWwk8cABOCKGwFdV9e=szJK+33PDFS8yUg@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Thu, Sep 13, 2012 at 1:22 PM, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
> On Wednesday, September 12, 2012 10:15 PM Fujii Masao
> On Wed, Sep 12, 2012 at 8:54 PM, <amit(dot)kapila(at)huawei(dot)com> wrote:
>>> The following bug has been logged on the website:
>>>
>>> Bug reference: 7534
>>> Logged by: Amit Kapila
>>> Email address: amit(dot)kapila(at)huawei(dot)com
>>> PostgreSQL version: 9.2.0
>>> Operating system: Suse 10
>>> Description:
>>
>>> 1. Both master and standby machines are connected normally.
>>> 2. Then use the command: ifconfig ip down; to take the network cards of
>>> the master and standby down.
>>
>>> Observation
>>> The master detects the broken connection, but the standby does not and
>>> keeps showing the connection as established for a long time.
>
>> What about setting keepalives_xxx libpq parameters?
>>
>> http://www.postgresql.org/docs/devel/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS
>
>> Keepalives are not a perfect solution for the termination of connection,
>> but it would help to a certain extent.
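
For illustration, the keepalive settings mentioned above could be put into the standby's primary_conninfo roughly as follows. This is only a minimal sketch, not a tested setup: the host name, user name, and numeric values are assumptions made up for the example, while keepalives, keepalives_idle, keepalives_interval and keepalives_count are the libpq parameter names from the page linked above. The second helper estimates how long a dead peer can go unnoticed on an idle connection, which is roughly idle + interval * count.

```python
def build_conninfo(host, user, idle=10, interval=5, count=3):
    """Return a libpq conninfo string with TCP keepalives enabled.

    host, user and the default values are hypothetical examples.
    """
    params = {
        "host": host,
        "user": user,
        "keepalives": 1,                  # enable TCP keepalives
        "keepalives_idle": idle,          # seconds of idleness before the first probe
        "keepalives_interval": interval,  # seconds between unanswered probes
        "keepalives_count": count,        # unanswered probes before the connection is dropped
    }
    return " ".join(f"{key}={value}" for key, value in params.items())


def idle_detection_seconds(idle, interval, count):
    """Approximate time for keepalives to notice a dead peer on an idle connection."""
    return idle + interval * count


if __name__ == "__main__":
    conninfo = build_conninfo("master.example.com", "replicator")
    print("primary_conninfo = '%s'" % conninfo)
    print("dead idle peer noticed after about %d s" % idle_detection_seconds(10, 5, 3))
```

Note that keepalive probes are only sent while the connection is idle, so this estimate does not apply while there is unacknowledged data in flight.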
>
> We tried enabling keepalives, but it didn't work, maybe because
> walreceiver is trying to send receiver status.
> Sending that fails only after many attempts.
>
>> If you need something like a walreceiver version of replication_timeout,
>> such a feature has not been implemented yet.
>> Please feel free to implement that!
>
> I would like to implement such a feature for walreceiver, but I am unsure
> whether to use the same configuration parameter (replication_timeout) for
> walreceiver as for the master, or to introduce a new configuration
> parameter (receiver_replication_timeout).

I like the latter. I believe some users want to set different timeout values,
for example when the master and standby servers are placed in the same room
but a cascaded standby is placed on another continent.
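
To make the idea concrete, here is a rough sketch of a receiver-side timeout check with its own setting, independent of the sender's replication_timeout. It is only an illustration in Python, not walreceiver code: the class name, the method names, and the example values of 5 and 60 seconds are all assumptions, and receiver_replication_timeout is just the parameter name proposed above.

```python
import time


class ReceiveTimeout:
    """Hypothetical receiver-side timeout, set independently of the sender's."""

    def __init__(self, timeout_s, now=time.monotonic):
        self.timeout_s = timeout_s  # 0 disables the check (an assumption of this sketch)
        self._now = now             # injectable clock, so the logic can be tested
        self._last = now()          # time the last message arrived

    def message_received(self):
        """Record that a message just arrived from the upstream server."""
        self._last = self._now()

    def expired(self):
        """True if nothing has arrived for at least timeout_s seconds."""
        if self.timeout_s <= 0:
            return False
        return self._now() - self._last >= self.timeout_s


if __name__ == "__main__":
    # Example values only: a short timeout for a standby in the same room,
    # a longer one for a cascaded standby on another continent.
    local_standby = ReceiveTimeout(timeout_s=5)
    remote_cascaded_standby = ReceiveTimeout(timeout_s=60)
    print(local_standby.expired(), remote_cascaded_standby.expired())
```

With a separate parameter, each standby can choose a value that fits the latency of its own link to the upstream server.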

Regards,

--
Fujii Masao
