Re: BUG #16508: using multi-host connection string when the first host is starting fails

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: jv(dot)cyril(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16508: using multi-host connection string when the first host is starting fails
Date: 2020-08-14 07:51:11
Message-ID: 20200814075111.GA1193352@rfd.leadboat.com
Lists: pgsql-bugs

On Thu, Aug 13, 2020 at 06:08:14PM -0400, Tom Lane wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
> > On Wed, Jun 24, 2020 at 08:17:44AM +0000, PG Bug reporting form wrote:
> >> I'm connecting to pg10 using psql (tried with clients psql 10.11 & psql
> >> 12.2) using a connection string such as:
> >> psql 'dbname=xxxxx1,xxxxx2,xxxxx3,xxxxx4 target_session_attrs=read-write'
> >>
> >> the connection to the first database (xxxxx1) fails with the error:
> >> psql.bin: FATAL: the database system is starting up
> >>
> >> which is correct according to postgres state on that machine,
> >> but then I would expect psql to try the next server (xxxxx2), which is
> >> the one accepting the connection params (target_session_attrs=read-write),
> >> instead of reporting the error.
>
> > I agree.

> I assume that the actual test case involved a comma-separated *host*
> (or hostaddr) list, which is what drives multiple connection attempts.

Like you, I assumed that.

> It is true that if we manage to make a connection to a host, but it
> then rejects us for some reason, we just give up rather than trying
> the next host. The problem with trying to improve that is that it's
> very unclear which cases it's actually appropriate to do that for.

That is an obstacle, yes. Assume the connection string has N>=2 entries
where, typically, one is read-write and N-1 are read-only. Here's the
principle that made me agree with the bug report. It should be possible to
"pg_ctl restart" any one host without interrupting clients' ability to form a
read-only connection using the multi-host connection string. Currently, the
outcome depends on the timing within the restart sequence:

1. fails during shutdown checkpoint
2. succeeds after port closes
3. fails after port opens, during early recovery
4. succeeds after recovery permits read-only connections

That's a bad user experience. I didn't form a proposal for what to do
instead, but I doubt we already have the optimum.
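
To make the four outcomes concrete, here is a minimal Python sketch of the behavior described above. It is a toy model, not libpq code: the names Phase, connect_multi_host and restart_outcomes are hypothetical, and the only rules assumed are the ones stated in this thread (a host that cannot be reached is skipped; a host that accepts the connection and then rejects it ends the whole attempt).

```python
from enum import Enum


class Phase(Enum):
    SHUTDOWN_CHECKPOINT = "shutdown checkpoint"   # port open, server rejects
    PORT_CLOSED = "port closed"                   # nothing listening
    EARLY_RECOVERY = "early recovery"             # port open, server rejects
    ACCEPTING = "accepting connections"           # read-only sessions allowed


# Phases in which the server answers the socket but then refuses the session.
REJECTING = {Phase.SHUTDOWN_CHECKPOINT, Phase.EARLY_RECOVERY}


def connect_multi_host(hosts):
    """Try (name, phase) pairs in order; return the host connected to, or None.

    Assumed model of the behavior discussed in the thread: an unreachable
    host is skipped, but a host that accepts the connection and then
    rejects it ends the whole attempt.
    """
    for name, phase in hosts:
        if phase is Phase.PORT_CLOSED:
            continue            # could not connect at all: try the next host
        if phase in REJECTING:
            return None         # connected, then rejected: give up
        return name             # session established
    return None


def restart_outcomes():
    """Outcome for each phase of host1's restart while host2 stays up."""
    return {
        phase: connect_multi_host([("host1", phase), ("host2", Phase.ACCEPTING)])
        for phase in Phase
    }


if __name__ == "__main__":
    for phase, result in restart_outcomes().items():
        print(f"{phase.value:22} -> {result or 'FAILS'}")
```

In this model the attempt fails in phases 1 and 3 even though host2 is healthy the whole time, and succeeds in phases 2 and 4 (via host2 and host1 respectively), which is the fail/succeed/fail/succeed pattern listed above.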

> As an example, if you fat-finger the password to host 1, it's unlikely
> that silently switching our attention to host 2 would be advisable.
> At best, what you'd get is several confusing duplicate messages.

I'd be fine with the duplicate messages. Yeah, if we could divine that the
connection attempt failed due to a client typo, it would be nice to stop
there. A client can't divine that. If the PostgreSQL servers use PAM or LDAP
authentication, server-side trouble can cause transient authentication
failures.
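
For illustration only, the trade-off can be sketched with a hypothetical "retry on every error" policy. Nothing here is an existing libpq option or a proposal from this thread; connect_retry_all, the host states and the passwords are invented for the example, and the second message text is a placeholder.

```python
def connect_retry_all(hosts, password, correct_password="s3cret"):
    """Hypothetical policy: on any failure, move on to the next host.

    hosts is a list of (name, state) pairs where state is "starting" or "up".
    Returns (connected_host_or_None, list_of_error_messages).
    """
    errors = []
    for name, state in hosts:
        if state == "starting":
            errors.append(f"{name}: the database system is starting up")
        elif password != correct_password:
            errors.append(f"{name}: password authentication failed")
        else:
            return name, errors
    return None, errors


if __name__ == "__main__":
    # A restarting first host no longer blocks the connection...
    print(connect_retry_all([("host1", "starting"), ("host2", "up")], "s3cret"))
    # ...but a mistyped password is reported once per host.
    print(connect_retry_all([("host1", "up"), ("host2", "up")], "typo"))
```

Under this assumed policy the restart case connects to host2, while a mistyped password produces one message per host, which is the duplicate-message cost being weighed here.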

I do seem to recall discussion that rejected retrying on all errors, but I
looked and didn't locate it.
