Re: PQhost may return socket dir for network connection

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PQhost may return socket dir for network connection
Date: 2017-05-01 17:13:11
Message-ID: CA+TgmoZ+9h=miD4+wYc9QztezgtLfeA59XtxVAL0NUjvfwKmaA@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 1, 2017 at 12:06 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Fri, Apr 28, 2017 at 3:43 AM, Kyotaro HORIGUCHI
>> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>> As the subject says, PQhost() seems to be forgetting about the case
>>> where only hostaddr is specified in a connection string.
>
>> I suspect that may have been intentional.
>
> See commits 11003eb55 and 40cb21f70 for recent history in this area.
> Further back there's more history around host vs. hostaddr. We've
> gone back and forth on this in the past, including an ultimately
> reverted attempt to add "PQhostaddr()", so it'd be a good idea to
> study those past threads before proposing a change here.
>
> Having said that, the behavior stated in $subject does sound wrong.

I'm not sure. My understanding of the relationship between host and
hostaddr is that hostaddr overrides our notion of where to find host,
but not our notion of the host to which we're connecting. Under that
definition, the current behavior as described by Kyotaro sounds
correct. When you say host=X hostaddr=Y, we act as though we're
connecting to X but try to connect to IP Y. When you just specify
hostaddr=Y, we act as if we're trying to connect to the default host
(which happens to be a socket address in that example) but actually
use the specified IP address. That's consistent.
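
To make that definition concrete, here is a minimal sketch in C. It is
a toy model of the rule as I describe it above, not libpq's actual
code: the struct, the function names, and the "/tmp" default socket
directory are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model, NOT libpq source. All names here are made up. */
#define MODEL_DEFAULT_HOST "/tmp"   /* assumed default socket directory */

typedef struct {
    const char *host;      /* value of host=, or NULL if not given */
    const char *hostaddr;  /* value of hostaddr=, or NULL if not given */
} model_conninfo;

/* True if the parameter was supplied and is non-empty. */
int model_is_set(const char *s)
{
    return s != NULL && strlen(s) > 0;
}

/* Where we actually connect: hostaddr overrides host when present. */
const char *model_connect_target(const model_conninfo *ci)
{
    assert(ci != NULL);
    if (model_is_set(ci->hostaddr))
        return ci->hostaddr;
    if (model_is_set(ci->host))
        return ci->host;
    return MODEL_DEFAULT_HOST;
}

/* Which host we act as though we are connected to: hostaddr is ignored. */
const char *model_reported_host(const model_conninfo *ci)
{
    assert(ci != NULL);
    if (model_is_set(ci->host))
        return ci->host;
    return MODEL_DEFAULT_HOST;
}
```

With host=X hostaddr=Y this model reports X and connects to Y; with
only hostaddr=Y it reports the default host and still connects to Y,
which is the behavior Kyotaro observed.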

Now, against that,
https://www.postgresql.org/docs/9.6/static/libpq-connect.html says:

--
If hostaddr is specified without host, the value for hostaddr gives
the server network address. The connection attempt will fail if the
authentication method requires a host name.
--

And PQhost() asks for the host name, so you could argue that it too
should "fail" in this situation. But PQhost() has no way to report
failure, so that's kind of problematic.
https://www.postgresql.org/docs/9.6/static/libpq-status.html says:

--
Without either a host name or host address, libpq will connect using a
local Unix-domain socket; or on machines without Unix-domain sockets,
it will attempt to connect to localhost.
--

But that's just about where to make the connection, not what we should
consider the hostname to be for purposes other than calling
connect(2).

Kyotaro Horiguchi argues that the current behavior is "not useful" and
that's probably true, but I don't think it's the job of an API to try
as hard as possible to do something useful in every case. It's more
important that the behavior is predictable and understandable. In
short, if we're going to change the behavior of PQhost() here, then
there should be a documentation change to go with it, because the
current documentation around the interaction between host and hostaddr
does not make it at all clear that the current behavior is wrong, at
least not as far as I can see. To the contrary, it suggests that if
you use hostaddr without host, you may find the results surprising or
even unfortunate:

--
Using hostaddr instead of host allows the application to avoid a host
name look-up, which might be important in applications with time
constraints. However, a host name is required for GSSAPI or SSPI
authentication methods, as well as for verify-full SSL certificate
verification. ... Note that authentication is likely to fail if host
is not the name of the server at network address hostaddr.
--

The overall impression that the documentation leaves me with is that
you are expected to use only host unless you care about saving name
lookups; then, use both host and hostaddr; if you want to use just
hostaddr you can try it, but it'll fail to work properly if you then
try to do something that needs a host name. Calling PQhost() is,
perhaps, one such thing.
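
That reading of the documentation can be written down as a small
check. This is only a sketch of my interpretation, not anything libpq
does: the enum, the function names, and the use of the strings "gss",
"sspi" and "verify-full" to stand for the cases the documentation
lists are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper types and names, NOT part of libpq. */
typedef enum {
    CFG_NEITHER,            /* neither host nor hostaddr given */
    CFG_HOST_ONLY,          /* the normal case */
    CFG_HOST_AND_HOSTADDR,  /* saves the name lookup, keeps the name */
    CFG_HOSTADDR_ONLY       /* works until something needs a host name */
} cfg_style;

static bool cfg_has(const char *s)
{
    return s != NULL && s[0] != '\0';
}

/* Classify a host/hostaddr pair into one of the four usage styles. */
cfg_style cfg_classify(const char *host, const char *hostaddr)
{
    if (cfg_has(host))
        return cfg_has(hostaddr) ? CFG_HOST_AND_HOSTADDR : CFG_HOST_ONLY;
    return cfg_has(hostaddr) ? CFG_HOSTADDR_ONLY : CFG_NEITHER;
}

/* Per the quoted docs: GSSAPI, SSPI and verify-full need a host name. */
bool cfg_needs_host_name(const char *auth_method, const char *sslmode)
{
    assert(auth_method != NULL && sslmode != NULL);
    return strcmp(auth_method, "gss") == 0 ||
           strcmp(auth_method, "sspi") == 0 ||
           strcmp(sslmode, "verify-full") == 0;
}

/* False only for hostaddr-without-host combined with a name-requiring feature. */
bool cfg_likely_ok(const char *host, const char *hostaddr,
                   const char *auth_method, const char *sslmode)
{
    return !(cfg_classify(host, hostaddr) == CFG_HOSTADDR_ONLY &&
             cfg_needs_host_name(auth_method, sslmode));
}
```

Under this sketch, hostaddr alone with "md5" and sslmode "prefer" is
fine, while hostaddr alone with sslmode "verify-full" is flagged; the
open question is whether a PQhost() call belongs in the same bucket.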

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
