Re: walsender doesn't send keepalives when writes are pending

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: walsender doesn't send keepalives when writes are pending
Date: 2014-02-14 13:25:55
Message-ID: 20140214132555.GN4910@awork2.anarazel.de
Lists: pgsql-hackers

On 2014-02-14 13:58:59 +0100, Andres Freund wrote:
> On 2014-02-14 12:55:06 +0000, Greg Stark wrote:
> > On Fri, Feb 14, 2014 at 12:05 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > > There's no reason not
> > > to ask for a ping when we're writing.
>
> > Is there a reason to ask for a ping? The point of keepalives is to
> > ensure there's some traffic on idle connections so that if the
> > connection is dead it doesn't linger forever and so that any on-demand
> > links (or more recently NAT routers or stateful firewalls) don't time
> > out and disconnect and have to reconnect (or more recently just fail
> > outright).
>
> This ain't TCP keepalives. The reason is that we want to kill walsenders
> if they haven't responded to a ping inside wal_sender_timeout. That's
> rather important e.g. for synchronous replication, so we can quickly fall
> over to the next standby. In such scenarios you'll usually want a
> timeout *far* below anything TCP provides.

walreceiver sends pings every time it receives a 'w' message, so it's
probably not an issue there, but pg_receivexlog/basebackup don't; they
use their own configured interval. So this might be an explanation of
the latter two being disconnected too early. I've seen reports of
that...
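
To make the difference between the two reply policies concrete, here is a
rough toy model. It is not PostgreSQL code. The function name, its
parameters, the one-second ticks, and the assumption that a client answers
a ping immediately are all made up for illustration. The "ask for a reply
after half the timeout" rule is only a stand-in for whatever the sender
would really do.

```python
def first_disconnect(timeout, status_interval, duration, ping_while_writing=False):
    """Simulate a continuously busy sender in abstract 1-second ticks.

    Returns the tick at which the sender gives up on the client, or None
    if the client survives for `duration` ticks.

    status_interval == 0 models a client that replies to every data
    message (walreceiver-like). A positive value models a client that
    only replies on its own timer (pg_receivexlog-like).
    All names here are hypothetical.
    """
    last_reply = 0
    for now in range(1, duration + 1):
        if status_interval == 0 or now % status_interval == 0:
            # Client replies according to its own policy.
            last_reply = now
        elif ping_while_writing and now - last_reply >= timeout / 2:
            # Sender asks for a reply even though it is busy writing;
            # the client is assumed to answer right away.
            last_reply = now
        if now - last_reply >= timeout:
            # No reply inside the timeout: the sender drops the client.
            return now
    return None


if __name__ == "__main__":
    # Replies to every message: never dropped.
    print(first_disconnect(timeout=5, status_interval=0, duration=60))
    # Own interval longer than the timeout, no pings while writing.
    print(first_disconnect(timeout=5, status_interval=10, duration=60))
    # Same client, but the sender pings while writing.
    print(first_disconnect(timeout=5, status_interval=10, duration=60,
                           ping_while_writing=True))
```

In this model only the second case is disconnected, and only because the
busy sender never asks for a reply. That is the situation I suspect
pg_receivexlog/basebackup users are running into.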

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
