On Tuesday, October 09, 2012 6:00 PM Robert Haas wrote:
> On Mon, Oct 8, 2012 at 10:42 AM, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
> > How about following:
> > 1. replication_client_timeout -- shouldn't it be client as new
> > is for wal receiver
> > 2. replication_standby_timeout
> ISTM that the client and the standby are the same thing.
Yes, they are the same, but one of them (replication_standby_timeout) may be
easier for users to understand.
> > If we introduce a new parameter for wal receiver, wouldn't
> > replication_timeout be confusing for user?
> I actually don't think that I understand what problem we're
> trying to solve here. If the connection between the master and the
> standby is lost, shouldn't the standby realize that it's no longer
> receiving keepalives from the master and terminate the connection?
For the wal receiver, keepalives are just another kind of message. When it
finds that it has not received any message, it tries to send a reply/feedback
message to the master after an interval of wal_receiver_status_interval.
So after every wal_receiver_status_interval the wal receiver sends a reply, but
the socket send still doesn't fail. It fails only after many send calls,
because internally send() may keep accumulating data until the socket's
internal buffer is full, even if the other side's recv has not received the
data.
So that's the reason we decided to introduce a timeout parameter in wal
receiver similar to what we have currently in walsender.
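
To illustrate the buffering behavior described above, here is a minimal
sketch in Python (not PostgreSQL code). It assumes a local socket pair whose
peer never calls recv() as a stand-in for an unreachable master; the function
name count_buffered_sends and its parameters are made up for this example.

```python
import socket


def count_buffered_sends(chunk_size=1024, max_sends=100_000):
    """Count how many send() calls succeed while the peer never reads."""
    sender, silent_peer = socket.socketpair()
    try:
        # Non-blocking mode makes a full buffer visible as an exception
        # instead of a send() call that hangs.
        sender.setblocking(False)
        payload = b"x" * chunk_size
        successful_sends = 0
        while successful_sends < max_sends:
            try:
                sender.send(payload)
            except BlockingIOError:
                # The kernel buffer is full; only now does the sender notice
                # that nothing is being consumed on the other side.
                break
            successful_sends += 1
        return successful_sends
    finally:
        sender.close()
        silent_peer.close()


if __name__ == "__main__":
    print("send() calls accepted with no reader:", count_buffered_sends())
```

The exact count depends on the platform's socket buffer sizes, but it is
normally well above one, which is why a periodic reply alone is a slow way to
detect a dead connection.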
> I thought I had tested this at some point and it was working, so either
> it's subsequently gotten broken again or the scenario you're talking
> about is different in some way that I don't currently understand.
The standby takes much longer, around 15 minutes, to detect the loss, whereas
the master detects it much sooner, in 2-3 minutes, and even the master detects
it mainly because of the timeout functionality in wal sender.
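
The kind of receive-side timeout that lets one end detect a dead peer
quickly could look roughly like the following sketch. This is only an
illustration in Python, not the walsender implementation; the class
ReceiveTimeout and its method names are hypothetical, and time is passed in
explicitly so the logic is easy to follow.

```python
class ReceiveTimeout:
    """Tracks time since the last message from the peer (hypothetical helper)."""

    def __init__(self, timeout_s, now):
        self.timeout_s = timeout_s  # assumption for this sketch: 0 disables the check
        self.last_received = now

    def on_message(self, now):
        # Any message, including a keepalive, resets the timer.
        self.last_received = now

    def expired(self, now):
        if self.timeout_s <= 0:
            return False
        return now - self.last_received >= self.timeout_s


if __name__ == "__main__":
    timer = ReceiveTimeout(timeout_s=60, now=0)
    timer.on_message(now=30)
    print(timer.expired(now=80))  # False: only 50 seconds since the last message
    print(timer.expired(now=90))  # True: 60 seconds with nothing received
```

With a check like this the receiver gives up after a fixed period of silence,
instead of waiting for send() to fail once the socket buffer fills.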