Re: [doc fix] PG10: wrong description on connect_timeout when multiple hosts are specified

From: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
To: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "Robert Haas (robertmhaas(at)gmail(dot)com)" <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [doc fix] PG10: wrong description on connect_timeout when multiple hosts are specified
Date: 2017-05-12 08:54:13
Message-ID: 0A3221C70F24FB45833433255569204D1F6F5924@G01JPEXMBYT05
Lists: pgsql-hackers

From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Tsunakawa,
> Takayuki
> I found a wrong sentence here in the doc. I'm sorry, this is what I asked
> you to add...
>
> https://www.postgresql.org/docs/devel/static/libpq-connect.html#libpq-paramkeywords
>
> connect_timeout
> Maximum wait for connection, in seconds (write as a decimal integer string).
> Zero or not specified means wait indefinitely. It is not recommended to
> use a timeout of less than 2 seconds. This timeout applies separately to
> each connection attempt. For example, if you specify two hosts and both
> of them are unreachable, and connect_timeout is 5, the total time spent
> waiting for a connection might be up to 10 seconds.
>
>
> The program behavior is that libpq times out after connect_timeout seconds
> regardless of how many hosts are specified. I confirmed it like this:
>
> $ export PGOPTIONS="-c post_auth_delay=30"
> $ psql -d "dbname=postgres connect_timeout=5" -h localhost,localhost -p 5432,5433
> (psql errors out after 5 seconds)
>
> Could you fix the doc with something like this?
>
> "This timeout applies across all the connection attempts. For example, if
> you specify two hosts and both of them are unreachable, and connect_timeout
> is 5, the total time spent waiting for a connection is up to 5 seconds."
>
> Should we also change the minimum "2 seconds" part to be longer, according
> to the number of hosts?

Instead, I think we should fix the program to match the documented behavior. Otherwise, if the first database machine is down, libpq might wait for about 2 hours (depending on the OS's TCP keepalive setting), during which it times out after connect_timeout and does not attempt to connect to the other hosts.
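
To make the difference between the two behaviors concrete, here is a small Python model. It is only a sketch, not libpq code: the function simulate_connect and its parameters are made up for illustration, it does no real networking, and it ignores the "zero means wait indefinitely" case. It compares one timer shared by all hosts (what the program does now) with a fresh timer per host (what the doc describes).

```python
from typing import Optional, Sequence, Tuple


def simulate_connect(
    response_times: Sequence[Optional[float]],
    connect_timeout: float,
    per_attempt: bool,
) -> Tuple[Optional[int], float]:
    """Model a multi-host connection loop (illustrative only).

    response_times[i] is how many seconds host i takes to accept a
    connection, or None if it never answers.
    Returns (index of the host we connected to, or None on failure,
             total seconds spent waiting).
    """
    elapsed = 0.0
    for i, t in enumerate(response_times):
        if per_attempt:
            # Documented behavior: every host gets the full timeout.
            budget = connect_timeout
        else:
            # Observed behavior: one timer is shared by all hosts.
            budget = connect_timeout - elapsed
            if budget <= 0:
                break  # timer already used up; remaining hosts are skipped
        if t is not None and t <= budget:
            return i, elapsed + t
        elapsed += budget  # waited out the whole budget on this host
    return None, elapsed


# First host is down, second host would answer in 1 second.
print(simulate_connect([None, 1.0], 5, per_attempt=False))
print(simulate_connect([None, 1.0], 5, per_attempt=True))
```

In this model, with the shared timer the second host is never tried, because the first host consumes the whole connect_timeout. With a per-attempt timer the loop moves on and connects to the second host.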

I'll add this item in the PostgreSQL 10 Open Items.

Regards
Takayuki Tsunakawa
