Re: Timeout and wait-forever in sync rep

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Timeout and wait-forever in sync rep
Date: 2010-10-19 05:56:32
Message-ID: AANLkTim87K06KgPBcjDp6JyKXQNOvndwubCPqARS8Yhw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 18, 2010 at 10:24 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> I mean, for example, that the server cannot detect the disconnection for
> more than 60 seconds even if the user configures the keepalive as follows.
>
>    tcp_keepalives_idle      = 10
>    tcp_keepalives_interval  = 5
>    tcp_keepalives_count     = 2

Yeah, TCP is not going to detect a broken connection that quickly.

I think there's a fundamental impedance mismatch between the
application's needs here and the design goals of TCP.

TCP is designed to work if at all possible and only generate an error
if it's unavoidable. Keepalives were controversial when they were
proposed, but for their original purpose -- ensuring that long-lived
servers didn't leak connections indefinitely -- they work.
The point of them was to cover the remaining cases where there was no
data in flight and therefore no way to ever detect that the connection
was dead.
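
To put rough numbers on the idle case, here is a minimal sketch of the
arithmetic for the settings quoted at the top. It assumes the usual
Linux-style behaviour (first probe after the idle period, then one probe
per interval, connection dropped once `count` probes go unanswered). The
function name is made up for illustration, and the result only applies
when nothing is waiting to be acknowledged.

```python
def keepalive_detection_seconds(idle, interval, count):
    """Approximate time for keepalives to declare an idle connection dead.

    Assumes Linux-style semantics: the first probe is sent after `idle`
    seconds of silence, and each of the `count` probes is given
    `interval` seconds to be answered before the connection is dropped.
    """
    return idle + interval * count


# Settings quoted above: idle=10, interval=5, count=2
print(keepalive_detection_seconds(idle=10, interval=5, count=2))  # 20
```

So even in the best case these settings give about 20 seconds, and only
if the connection was idle when it broke. Once there is unacknowledged
data in flight, the retransmission timer takes over instead.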

TCP is only going to detect a connection as dead if it has exceeded
all the engineering limits of the network. Until then it's still
possible it'll come back, and having the network layer generate an
error when the connection might still be functioning would be bad.
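
As an illustration of how generous those limits can be, the following
sketch adds up an exponential retransmission backoff. The defaults are
assumptions that are not part of this thread: they are the values I
believe Linux uses (15 retries via tcp_retries2, a 200 ms minimum and
120 s maximum retransmission timeout). Real stacks derive the timeout
from the measured round-trip time, so treat the result as a ballpark
rather than a guarantee. The function name is hypothetical.

```python
def retransmit_give_up_seconds(retries=15, rto_min=0.2, rto_max=120.0):
    """Sum the waits of an exponential backoff before TCP gives up.

    The timeout starts at `rto_min`, doubles after every unanswered
    transmission, and is capped at `rto_max`. The original send plus
    `retries` retransmissions each wait one timeout.
    """
    total = 0.0
    rto = rto_min
    for _ in range(retries + 1):
        total += min(rto, rto_max)
        rto *= 2
    return total


print(round(retransmit_give_up_seconds(), 1))  # 924.6
```

Under those assumed defaults a connection with data in flight can take
roughly 15 minutes to be reported dead, no matter how the keepalive
settings are tuned.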

--
greg
