Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Fujii Masao'" <masao(dot)fujii(at)gmail(dot)com>
Cc: <pgsql-bugs(at)postgresql(dot)org>, <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
Date: 2012-09-13 04:22:08
Message-ID: 003b01cd9167$5735e020$05a1a060$@kapila@huawei.com
Lists: pgsql-bugs, pgsql-hackers

On Wednesday, September 12, 2012 10:15 PM Fujii Masao wrote:
> On Wed, Sep 12, 2012 at 8:54 PM,  <amit(dot)kapila(at)huawei(dot)com> wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference:      7534
>> Logged by:          Amit Kapila
>> Email address:      amit(dot)kapila(at)huawei(dot)com
>> PostgreSQL version: 9.2.0
>> Operating system:   Suse 10
>> Description:
>>
>> 1. Both the master and standby machines are connected normally.
>> 2. Then use the command "ifconfig ip down" to take down the network card
>> of the master and standby.
>>
>> Observation
>> The master detects the abnormal connection, but the standby does not and
>> shows the channel as connected for a long time.

> What about setting keepalives_xxx libpq parameters?
>
> http://www.postgresql.org/docs/devel/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS
>
> Keepalives are not a perfect solution for the termination of connection, but
> it would help to a certain extent.

We tried enabling keepalives, but that didn't work, maybe because
walreceiver is trying to send receiver status.
The send fails only after many attempts.
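
For reference, here is a minimal sketch of how the keepalive settings
suggested above could be put into a conninfo string, written in Python purely
for illustration. The parameter names keepalives, keepalives_idle,
keepalives_interval and keepalives_count are the libpq ones; the host, user,
default values and helper names are assumptions made up for this example. The
second helper is only a rough estimate for an idle connection and does not
cover the case described above, where a send is already pending.

```python
def build_conninfo(host, port, user, idle=10, interval=5, count=3):
    """Build a libpq-style conninfo string with TCP keepalives enabled.

    Values are assumed to contain no spaces or quotes, so no escaping is done.
    """
    params = {
        "host": host,
        "port": port,
        "user": user,
        "keepalives": 1,                  # turn TCP keepalives on
        "keepalives_idle": idle,          # idle seconds before the first probe
        "keepalives_interval": interval,  # seconds between unanswered probes
        "keepalives_count": count,        # unanswered probes before giving up
    }
    return " ".join(f"{key}={value}" for key, value in params.items())


def keepalive_detection_seconds(idle, interval, count):
    """Rough time for an idle connection to be declared dead."""
    return idle + interval * count


if __name__ == "__main__":
    # "master.example" and "repl" are placeholder values.
    print(build_conninfo("master.example", 5432, "repl"))
    print(keepalive_detection_seconds(10, 5, 3), "seconds (approximate)")
```

On a standby, a string like this would go into primary_conninfo in
recovery.conf.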

> If you need something like walreceiver-version of replication_timeout,
> such feature has not been implemented yet.
> Please feel free to implement that!

I would like to implement such a feature for walreceiver, but I am unsure
whether to use the same configuration parameter (replication_timeout) for
walreceiver as for the master, or to introduce a new configuration parameter
(receiver_replication_timeout).

The only reason to have different timeout parameters for walsender and
walreceiver is the case of a standby that runs both a walreceiver and a
walsender, the latter to send logs to a cascaded standby. In that case
somebody might want different timeouts for walsender and walreceiver.
OTOH, having too many parameters will create confusion. My opinion is to
have one timeout parameter for both walsender and walreceiver.
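
A receiver-side timeout of this kind could look roughly like the following
Python sketch. This is not walreceiver code, only an illustration of the idea:
the class and method names are hypothetical, and treating 0 as "disabled" is
an assumption borrowed from how replication_timeout behaves on the master.

```python
import time


class ReceiverTimeout:
    """Tracks when the last message arrived from the sender (hypothetical)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s   # 0 or less means the check is disabled
        self.clock = clock           # injectable so the logic can be tested
        self.last_received = clock()

    def note_message(self):
        """Call whenever any message arrives from the sender."""
        self.last_received = self.clock()

    def expired(self):
        """True if nothing has arrived for at least timeout_s seconds."""
        if self.timeout_s <= 0:
            return False
        return self.clock() - self.last_received >= self.timeout_s


if __name__ == "__main__":
    fake_now = [0.0]
    timer = ReceiverTimeout(60, clock=lambda: fake_now[0])
    fake_now[0] = 61.0
    if timer.expired():
        print("no message from sender within timeout; close the connection")
```

With a single shared parameter, the same value would be passed as timeout_s
on both the sending and the receiving side.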

Let me know your suggestions/opinions on this.

Note: I am CCing pgsql-hackers, as this will be a feature request.

With Regards,
Amit Kapila.



