Re: [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur
Date: 2017-05-15 18:28:17
Message-ID: 23238.1494872897@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sun, May 14, 2017 at 9:50 PM, Tsunakawa, Takayuki
> <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> wrote:
>> By the way, could you elaborate what problem could occur if my solution is applied? (it doesn't seem easy for me to imagine...)

> Sure. Imagine that the user thinks that 'foo' and 'bar' are the
> relevant database servers for some service and writes 'dbname=quux
> host=foo,bar' as a connection string. However, actually the user has
> made a mistake and 'foo' is supporting some other service entirely; it
> has no database 'quux'; the database servers which have database
> 'quux' are in fact 'bar' and 'baz'.

Even more simply, suppose that your userid is known to host bar but the
DBA has forgotten to create it on foo. This is surely a configuration
error that ought to be rectified, not just failed past, or else you don't
have any of the redundancy you think you do.

Of course, the user would have to try connections to both foo and bar
to be sure that they're both configured correctly. But he might try
"host=foo,bar" and "host=bar,foo" and figure he was OK, not noticing
that both connections had silently been made to bar.
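
To make that scenario concrete, here is a toy model of the two policies. It
is only a sketch and not libpq code: the HOST_ACCEPTS_LOGIN table and both
function names are invented for the illustration, and every host is assumed
to be reachable, so a login rejection is the only possible failure.

```python
# Hypothetical setup: the role exists on bar, but the DBA forgot it on foo.
HOST_ACCEPTS_LOGIN = {"foo": False, "bar": True}


def connect_fail_past_everything(hosts):
    """Try hosts in order and move on after ANY failure, rejections included."""
    for host in hosts:
        if HOST_ACCEPTS_LOGIN[host]:
            return host  # connected
    return None  # every host failed


def connect_stop_on_rejection(hosts):
    """Try hosts in order, but treat a login rejection as a final error."""
    for host in hosts:
        if HOST_ACCEPTS_LOGIN[host]:
            return host  # connected
        raise RuntimeError(f"login rejected by {host}")
    return None  # empty host list


if __name__ == "__main__":
    # Both orderings quietly land on bar; the problem on foo stays hidden.
    print(connect_fail_past_everything(["foo", "bar"]))
    print(connect_fail_past_everything(["bar", "foo"]))
    # With the stricter policy, the foo-first ordering exposes the mistake.
    try:
        connect_stop_on_rejection(["foo", "bar"])
    except RuntimeError as exc:
        print(exc)
```

Under the first policy the user's two test connections both succeed, which
is exactly why the misconfiguration on foo goes unnoticed.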

The bigger picture here is that we only want to fail past transient
errors, not configuration errors. I'm willing to err in favor of
regarding doubtful cases as transient, but most server login rejections
aren't for transient causes.

There might be specific post-connection errors that we should consider
retrying; "too many connections" is an obvious case.
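
As a rough sketch of what such a distinction could look like, the snippet
below classifies a failed attempt by its SQLSTATE. The SQLSTATE values are
the standard PostgreSQL error codes, but the function name, the choice of
which codes count as transient, and the use of None to stand for a failure
with no server-reported code (for example, a refused TCP connection) are
assumptions made for this example; it does not describe what libpq does.

```python
# Server-reported errors that plausibly clear up on their own (assumed set).
TRANSIENT_SQLSTATES = {
    "53300",  # too_many_connections
    "57P03",  # cannot_connect_now, e.g. the server is still starting up
}


def should_try_next_host(sqlstate):
    """Return True if a failed attempt should fall through to the next host.

    sqlstate is the five-character code reported by the server, or None when
    the attempt failed before the server could report anything.
    """
    if sqlstate is None:
        return True  # connection-level failure: treat as transient
    return sqlstate in TRANSIENT_SQLSTATES


if __name__ == "__main__":
    for code in (None, "53300", "28P01", "28000", "3D000"):
        print(code, should_try_next_host(code))
```

Codes such as 28P01 (invalid_password), 28000
(invalid_authorization_specification) and 3D000 (invalid_catalog_name) are
left out of the set on purpose: they point at configuration mistakes that
should be reported rather than failed past.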

regards, tom lane
