Re: beta3 & the open items list

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, jd(at)commandprompt(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: beta3 & the open items list
Date: 2010-06-20 23:42:00
Message-ID: 316B4E65-6192-4722-BC6E-0373F090FEAD@phlo.org
Lists: pgsql-hackers

On Jun 21, 2010, at 0:13 , Greg Stark wrote:
>> Keepalive is therefore extremely unlikely to break things - in the very worst case, a (really, really stupid) firewall might decide to drop packets with zero bytes of payload, causing inactive connections to abort after a while. AFAIK walreceiver will simply reconnect in this case.
>
> Stateful firewalls' whole raison d'etre is to block packets which
> aren't consistent with the current TCP state -- such as packets with a
> sequence number earlier than the last acked sequence number.
> Keepalives do in fact violate the basic TCP spec so they wouldn't be
> entirely crazy to block them.

Keepalives play games with the spec, but I'd say they don't outright violate it. The sender bluffs by retransmitting data it *knows* has been ACK'ed. But since nobody else can prove with certainty that the sender actually saw that ACK (think NIC-internal buffer overflow), nobody is able to call that bluff.

> Of course a firewall that blocked them
> would be pretty criminally stupid given how ubiquitous they are.

Very true, and another reason to stop worrying about possibly brain-dead firewalls.

>> Plus, the postmaster enables keepalive on all incoming connections
>> *already*, so any problems ought to have caused bug reports about
>> dropped client connections.
>
> Really? Since when? I thought there was some discussion about this
> about a year ago and I made it very clear this had to be an optional
> feature which defaulted to off.

For about 10 years now. The setsockopt call is in StreamConnection() in src/backend/libpq/pqcomm.c.

Here's the corresponding commit:

commit 5aa160abba32a1f2d7818b9f49213f38c99b3fd8
Author: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Date: Sat May 20 13:10:54 2000 +0000

Add KEEPALIVE option to the socket of backend. This will automatically
terminate the backend that has no frontend anymore.
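
For readers who have not seen it, turning on keepalive for an accepted socket takes a single setsockopt() call. What follows is only a minimal sketch of that technique, not a copy of what StreamConnection() does; the helper names enable_keepalive() and keepalive_enabled() are made up for illustration.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not the PostgreSQL source): enable TCP keepalive
 * on an already-created socket.  Returns 0 on success, -1 on failure
 * (errno is set by setsockopt).
 */
int
enable_keepalive(int sock)
{
    int on = 1;

    return setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/*
 * Hypothetical helper: read the flag back.
 * Returns 1 if keepalive is enabled, 0 if disabled, -1 on error.
 */
int
keepalive_enabled(int sock)
{
    int       val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

SO_KEEPALIVE only switches the mechanism on; how long the connection must be idle and how many probes are sent is left to the operating system's defaults unless they are tuned separately.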

> Keepalives introduce spurious disconnections in working TCP
> connections that have transient outages which is basic TCP
> functionality that's supposed to work. There are cases where that's
> what you want but it isn't the kind of thing that should be on by
> default, let alone on unconditionally.

I'd buy that if all timeouts and retry counts defaulted to +infinity. But they don't, and hence sufficiently long network outages *will* cause connection aborts anyway. That a particular connection might survive due to inactivity proves nothing, since whether the connection is active or inactive during an outage is usually outside of anyone's control.
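
To put a rough number on the keepalive side of this: an idle connection to a dead peer is given up after approximately the idle time plus the probe interval times the probe count. The sketch below just does that arithmetic. The function name is hypothetical, and the figures in the usage note are the usual Linux defaults (tcp_keepalive_time, tcp_keepalive_intvl, tcp_keepalive_probes), assumed here rather than taken from this thread.

```c
/*
 * Hypothetical helper: approximate worst-case time, in seconds, before
 * keepalive gives up on an idle connection whose peer has vanished.
 *
 *   idle     - seconds of inactivity before the first probe is sent
 *   interval - seconds between unanswered probes
 *   count    - number of unanswered probes before the connection is dropped
 */
long
keepalive_detect_seconds(long idle, long interval, long count)
{
    return idle + interval * count;
}
```

With the assumed Linux defaults, keepalive_detect_seconds(7200, 75, 9) gives 7875 seconds, a bit over two hours, so an idle connection only gets dropped if the outage lasts at least that long.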

I really fail to see why anyone would prefer connections (and therefore transactions!) getting stuck forever over a few spurious disconnects. The former always require manual intervention and cause all sorts of performance and disk-space issues, while the latter won't even be an issue for well-written clients, which just reconnect and retry.
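
As an illustration of what "reconnect and retry" can look like on the client side, here is a minimal, generic sketch. It deliberately uses no real client library: txn_fn, run_with_retry() and flaky_txn() are hypothetical names, and the callback is assumed to (re)establish the connection itself and to run a whole transaction that is safe to repeat from the start.

```c
#include <stdbool.h>

/*
 * Hypothetical callback type: reconnect if necessary and run one complete
 * transaction.  Returns true on success, false if the connection was lost.
 */
typedef bool (*txn_fn) (void *ctx);

/*
 * Call fn until it succeeds, at most max_attempts times.
 * Returns the number of attempts used, or 0 if every attempt failed.
 */
int
run_with_retry(txn_fn fn, void *ctx, int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++)
    {
        if (fn(ctx))
            return attempt;
    }
    return 0;
}

/*
 * Stand-in for a real transaction, for demonstration only: ctx points to
 * an int holding how many more times the call should fail.
 */
bool
flaky_txn(void *ctx)
{
    int *failures_left = ctx;

    if (*failures_left > 0)
    {
        (*failures_left)--;
        return false;       /* simulate a dropped connection */
    }
    return true;
}
```

A real client would probably also wait between attempts and distinguish connection loss from other errors; both are left out here to keep the sketch short.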

best regards,
Florian Pflug
