Re: strange error reporting

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: strange error reporting
Date: 2021-05-03 14:47:47
Message-ID: 3792030.1620053267@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, May 3, 2021 at 6:08 AM Peter Eisentraut
> <peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:
>> Throwing the socket address in there seems a bit distracting and
>> misleading, and it also pushes off the actual information very far to
>> the end. (Also, in some cases the socket path is very long, making the
>> actual information even harder to find.) By the time you get to this
>> error, you have already connected, so mentioning the server address
>> seems secondary at best.

> It feels a little counterintuitive to me too but I am nevertheless
> inclined to believe that it's an improvement. When multi-host
> connection strings are used, the server address may not be clear. In
> fact, even when they're not, it may not be clear to a new user that
> socket communication is used, and it may not be clear where the socket
> is located.

Yeah. The specific problem I'm concerned about solving here is
"I wasn't connecting to the server I thought I was", which could be
a contributing factor in almost any connection-time failure. The
multi-host-connection-string feature made that issue noticeably worse,
but surely we've all seen trouble reports that boiled down to that
even before that feature came in.

As you say, we could perhaps redesign the messages to provide this
info in another order. But it'd be difficult, and I think it might
come out even more confusing in cases where libpq tried several
servers on the way to finally failing. The old code's error
reporting for such cases completely sucked, whereas now you get
a reasonably complete trace of the attempts. As a quick example,
for a case of bad hostname followed by wrong port:

$ psql -d "host=foo1,sss2 port=5432,5342"
psql: error: could not translate host name "foo1" to address: Name or service not known
connection to server at "sss2" (192.168.1.48), port 5342 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?

v13 renders this as

$ psql -d "host=foo1,sss2 port=5432,5342"
psql: error: could not translate host name "foo1" to address: Name or service not known
could not connect to server: Connection refused
Is the server running on host "sss2" (192.168.1.48) and accepting
TCP/IP connections on port 5342?

Now, of course the big problem there is the lack of consistency about
how the two errors are laid out; but I'd argue that putting the
server identity info first is better than putting it later.
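To make the "server identity first" layout concrete, here is a minimal Python sketch. It is not libpq's code. The function names pair_hosts_and_ports and format_attempt_failure are hypothetical, and the message template is simply copied from the new-style output shown above. It assumes the documented multi-host rule that a comma-separated port list is matched to hosts by position, while a single port applies to every host.

```python
DEFAULT_PORT = "5432"  # PostgreSQL's usual default port


def pair_hosts_and_ports(host: str, port: str = "") -> list[tuple[str, str]]:
    """Pair each host in a multi-host string with its port (illustrative only)."""
    hosts = host.split(",")
    ports = port.split(",") if port else [""]
    if len(ports) == 1:
        # A single port (or none) applies to every host.
        ports = ports * len(hosts)
    elif len(ports) != len(hosts):
        raise ValueError("number of ports must be 1 or match the number of hosts")
    # An empty port entry falls back to the default.
    return [(h, p or DEFAULT_PORT) for h, p in zip(hosts, ports)]


def format_attempt_failure(host: str, addr: str, port: str, reason: str) -> str:
    """Put the server identity first, then the failure reason (hypothetical helper)."""
    return f'connection to server at "{host}" ({addr}), port {port} failed: {reason}'


if __name__ == "__main__":
    # Same host/port lists as the psql example above.
    for h, p in pair_hosts_and_ports("foo1,sss2", "5432,5342"):
        print(h, p)
    print(format_attempt_failure("sss2", "192.168.1.48", "5342", "Connection refused"))
```

With one line per attempted server, each starting with the host, address and port, a reader can tell which server produced which complaint without parsing the tail of the message.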

Also, if you experiment with other cases, such as some of the servers
complaining about a wrong user name, the old output makes it even
harder to tell which server said what.

regards, tom lane
