From: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: libpq connection timeout mismanagement |
Date: | 2018-08-11 14:42:16 |
Message-ID: | alpine.DEB.2.21.1808111611040.30788@lancre |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello Tom,
> The patch that taught libpq about allowing multiple target hosts
> modified connectDBComplete() with the intent of making the
> connect_timeout (if specified) apply per-host, not to the complete
> connection attempt. It did not do a very good job though, because
> the timeout only gets reset when connectDBComplete() itself detects
> a timeout. If PQconnectPoll advances to a new host due to some
> other cause, the previous host's timeout continues to run, possibly
> causing a premature timeout failure for the new one.
>
> Another thing that I find pretty strange is that it is coded so that,
> in event of a timeout detection by connectDBComplete, we give up on the
> current connhost entry and advance to the next host, ignoring any
> additional addresses we might have for the current hostname. This seems
> at best poorly thought through. There's no good reason for libpq to
> assume that all the addresses returned by DNS point at the same machine,
> or share the same network failure points in between.
>
> Attached is an attempt to improve this. I'll park it in the Sept fest.
Patch does not "git apply", but is ok with "patch -p1". Compiles.
Global "make check" is okay.
Feature works for me, i.e. each target host seems to get at least its
timeout.
Code looks straightforward enough.
In passing:
The code suggests that timeout is always 2 or more
if (timeout < 2) timeout = 2;
but doc says "It is not recommended to use a timeout of less than 2
seconds", which is inconsistent. It should read "Timeout is always set to
at least 2 whatever the user asks.".
connect_timeout=2.9 is accepted and considered as meaning 2.
connect_timeout=-10 or connect_timeout=two are also accepted and mean
forever. Probably thanks to "atoi".
--
Fabien.
From | Date | Subject | |
---|---|---|---|
Next Message | Andreas Seltenreich | 2018-08-11 15:45:43 | [sqlsmith] ERROR: plan should not reference subplan's variable |
Previous Message | Tomas Vondra | 2018-08-11 14:18:20 | Re: logical decoding / rewrite map vs. maxAllocatedDescs |