Re: how is the WAL receiver process stopped and restarted when the network connection is broken and then restored?

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Rui Hai Jiang <ruihaij(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: how is the WAL receiver process stopped and restarted when the network connection is broken and then restored?
Date: 2016-06-22 22:43:58
Message-ID: CAMsr+YE1Oc7axX=KqSUGn=wQp4gs=VLxO4WFk1YJP4E2X8yomA@mail.gmail.com
Lists: pgsql-hackers

On 22 June 2016 at 23:52, Rui Hai Jiang <ruihaij(at)gmail(dot)com> wrote:

> Hello,
>
> I have one Primary server and one Standby server. They are doing streaming
> replication well.
>
> I did some testing. I broke the network connection between them for a few
> minutes, and then restored the network. I found that both the WAL sender and
> WAL receiver were stopped and then restarted.
>
> I wonder how the WAL receiver process is stopped and restarted. I have checked
> the code hoping to find the answer, but I don't have any clue.
>

If TCP keepalives are enabled, the OS will break the TCP connection once
keepalive probes stop being answered.
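
To make the keepalive case concrete, here is a minimal sketch of what enabling
keepalives on a socket looks like at the OS level. It is plain Python, not
PostgreSQL code; the function name and the idle/interval/count values are
made up for illustration. In PostgreSQL itself the equivalent knobs are the
libpq connection parameters keepalives, keepalives_idle, keepalives_interval
and keepalives_count (usable in primary_conninfo) and the server-side
tcp_keepalives_* settings.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Enable TCP keepalives on sock (hypothetical helper, illustrative values).

    idle:     seconds of silence before the first probe is sent
    interval: seconds between unanswered probes
    count:    unanswered probes before the OS declares the connection dead
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning options are platform-specific, so apply only those that
    # this platform's socket module exposes.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock
```

With values like these, a dead peer would be detected after roughly
idle + interval * count seconds, at which point a blocked read on the socket
fails with an error instead of waiting forever.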

If wal_receiver_timeout is enabled (nonzero), the walreceiver will notice
that it hasn't received any data from the walsender within that period and
assume it went away.

If the OS notices that the socket went away - say, it got a TCP RST from
the remote peer as it shut down cleanly - it'll close the walreceiver
socket and the walreceiver will quit.

Otherwise it won't notice and will wait indefinitely.
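
The difference between the timeout case, the closed-socket case and the
"wait indefinitely" case can be sketched with a toy receive loop. This is
only an illustration in Python, not how the walreceiver is implemented;
read_until_idle and its return values are hypothetical names.

```python
import socket


def read_until_idle(sock, idle_timeout):
    """Read from sock until the peer goes silent or closes the connection.

    Returns (data, reason), where reason is "timeout" if nothing arrived
    for idle_timeout seconds, or "closed" if the peer closed the socket.
    Passing idle_timeout=None disables the timer, so a silently dead peer
    would leave recv() blocked indefinitely.
    """
    sock.settimeout(idle_timeout)
    received = b""
    while True:
        try:
            chunk = sock.recv(4096)
        except socket.timeout:
            # Application-level timeout: no data within the allowed window.
            return received, "timeout"
        if not chunk:
            # Empty read: the OS reported that the peer closed the connection.
            return received, "closed"
        received += chunk
```

For example, with a, b = socket.socketpair(), sending some bytes on a and
then calling read_until_idle(b, 0.2) returns the bytes with reason "timeout",
while closing a first makes it return with reason "closed".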

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
