Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
Date: 2012-09-15 18:44:19
Message-ID: CAHGQGwGWMOc-hKTyjjswzYLykAsA2+t+xpcybi6c+DkN+5dA+A@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Sat, Sep 15, 2012 at 4:26 PM, Amit kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
> On Saturday, September 15, 2012 11:27 AM Fujii Masao wrote:
> On Fri, Sep 14, 2012 at 10:01 PM, Amit kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
>>
>> On Thursday, September 13, 2012 10:57 PM Fujii Masao
>> On Thu, Sep 13, 2012 at 1:22 PM, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
>>> On Wednesday, September 12, 2012 10:15 PM Fujii Masao
>>> On Wed, Sep 12, 2012 at 8:54 PM, <amit(dot)kapila(at)huawei(dot)com> wrote:
>>>>>> The following bug has been logged on the website:
>
>>>>> I would like to implement such a feature for walreceiver, but I am
>>>>> unsure whether to use the same configuration parameter
>>>>> (replication_timeout) for walreceiver as for the master, or to
>>>>> introduce a new configuration parameter
>>>>> (receiver_replication_timeout).
>>
>>>>I like the latter. I believe some users want to set different
>>>>timeout values, for example when the master and standby servers are
>>>>in the same room but a cascaded standby is on another continent.
>>
>>> Thank you for your suggestion. I have implemented a separate timeout parameter for walreceiver, as you suggested.
>>> The main changes are:
>>> 1. Introduce a new configuration parameter, wal_receiver_replication_timeout, for walreceiver.
>>> 2. In WalReceiverMain(), if there has been no communication for wal_receiver_replication_timeout, exit the walreceiver.
>>> This is the same as the walsender functionality.
>>
>>> As this is a feature, I am submitting the attached patch to the upcoming CommitFest.
>>
>>> Suggestions/Comments?
>
>> You also need to change walsender so that it periodically sends a heartbeat
>> message, as walreceiver does every wal_receiver_status_interval. Otherwise,
>> walreceiver will wrongly detect a timeout whenever there is no traffic on
>> the master.
>
> Won't the current keepalive message from walsender suffice for that need?

No. The keepalive interval needs to be smaller than the timeout, but
IIRC there is currently no way to specify the keepalive interval.
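
The relationship between the heartbeat interval and the receiver timeout
can be illustrated with a small simulation. This is only a sketch under
stated assumptions, not PostgreSQL code: the function names
(heartbeat_times, receiver_times_out), the use of plain integer seconds,
and the idea that the connection is established at time 0 are all
hypothetical and chosen for the example.

```python
def heartbeat_times(interval, observe_until):
    """Times (in seconds) at which an otherwise idle sender emits a heartbeat.

    Hypothetical helper: assumes the first heartbeat is sent one full
    interval after the connection is established at time 0.
    """
    times = []
    t = interval
    while t <= observe_until:
        times.append(t)
        t += interval
    return times


def receiver_times_out(message_times, timeout, observe_until):
    """Return the time at which the receiver would give up, or None.

    The receiver gives up as soon as more than `timeout` seconds pass
    without any message arriving from the sender.
    """
    last_heard = 0  # assumption: connection established at time 0
    for t in sorted(message_times):
        if t > observe_until:
            break
        if t - last_heard > timeout:
            return last_heard + timeout
        last_heard = t
    if observe_until - last_heard > timeout:
        return last_heard + timeout
    return None


# Idle master, no heartbeat at all: a healthy connection looks dead.
no_heartbeat = receiver_times_out([], timeout=60, observe_until=300)

# Idle master, heartbeat every 10 s with a 60 s timeout: no false alarm.
short_interval = receiver_times_out(heartbeat_times(10, 300), 60, 300)

# Heartbeat interval (90 s) larger than the timeout (60 s): false alarm.
long_interval = receiver_times_out(heartbeat_times(90, 300), 60, 300)
```

With an idle master, the receiver in this sketch stays connected only in
the second case, where the heartbeat interval is below the timeout.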

Regards,

--
Fujii Masao
