| From: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> | 
| Subject: | Re: libpq connection timeout mismanagement | 
| Date: | 2018-08-11 14:42:16 | 
| Message-ID: | alpine.DEB.2.21.1808111611040.30788@lancre | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Hello Tom,
> The patch that taught libpq about allowing multiple target hosts
> modified connectDBComplete() with the intent of making the
> connect_timeout (if specified) apply per-host, not to the complete
> connection attempt.  It did not do a very good job though, because
> the timeout only gets reset when connectDBComplete() itself detects
> a timeout.  If PQconnectPoll advances to a new host due to some
> other cause, the previous host's timeout continues to run, possibly
> causing a premature timeout failure for the new one.
>
> Another thing that I find pretty strange is that it is coded so that,
> in event of a timeout detection by connectDBComplete, we give up on the
> current connhost entry and advance to the next host, ignoring any
> additional addresses we might have for the current hostname.  This seems
> at best poorly thought through.  There's no good reason for libpq to
> assume that all the addresses returned by DNS point at the same machine,
> or share the same network failure points in between.
>
> Attached is an attempt to improve this.  I'll park it in the Sept fest.
Patch does not "git apply", but is ok with "patch -p1". Compiles.
Global "make check" is okay.
Feature works for me, i.e. each target host seems to get at least its 
timeout.
Code looks straightforward enough.
In passing:
The code suggests that timeout is always 2 or more
if (timeout < 2) timeout = 2;
but doc says "It is not recommended to use a timeout of less than 2 
seconds", which is inconsistent. It should read "Timeout is always set to 
at least 2 whatever the user asks.".
connect_timeout=2.9 is accepted and considered as meaning 2. 
connect_timeout=-10 or connect_timeout=two are also accepted and mean 
forever. Probably thanks to "atoi".
-- 
Fabien.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andreas Seltenreich | 2018-08-11 15:45:43 | [sqlsmith] ERROR: plan should not reference subplan's variable | 
| Previous Message | Tomas Vondra | 2018-08-11 14:18:20 | Re: logical decoding / rewrite map vs. maxAllocatedDescs |