Re: Patch: Implement failover on libpq connect level.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter van Hardenberg <pvh(at)pvh(dot)ca>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: Implement failover on libpq connect level.
Date: 2016-10-24 15:57:27
Message-ID: CA+TgmoY7GbhB5QMuCF46toJ==soXEU4oWkDdzuaC+oCi9gQepg@mail.gmail.com
Lists: pgsql-hackers pgsql-jdbc

On Wed, Oct 19, 2016 at 7:26 PM, Peter van Hardenberg <pvh(at)pvh(dot)ca> wrote:
> Supporting different ports on different servers would be a much appreciated
> feature (I can't remember if it was Kafka or Cassandra that didn't do this
> and it was very annoying.)
>
> Remember, as the connection string gets more complicated, psql supports the
> Postgres URL format as a single command-line argument and we may want to
> begin encouraging people to use that syntax instead.

While I was experimenting with this today, I discovered a problem of
interpretation related to IPv6 addresses. Internally, a postgresql://
URL and a connection string are converted into the same format, so
postgresql://a,b/ means just the same thing as host=a,b. I thought
that we could similarly decide that postgresql://a:123,b:456/ is going to
get translated to host=a:123,b:456 and then there can be further code
to parse that into a list of host-and-port pairs. However, if you do
that, then something like host=1:2:3::4:5:6 is fundamentally
ambiguous. That :6 is equally valid either as part of the IP address
or as a trailing port number specification, and there's no a priori
way to know which it is. Today, since the host part can't include a
port specifier, it's regarded as part of the IP address, and I think
it would probably be a bad idea to change that, as I believe Victor's
patch would. He seems to have it in mind that we could allow things
like host=[1:2:3::4:5:6] or host=[1:2:3::4:5]:6, which might be
helpful for the future but doesn't avoid changing the meaning of
connection strings that work today.
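
The ambiguity can be checked mechanically. The following is a minimal sketch, not libpq code: `possible_readings` is a hypothetical helper that uses Python's standard `ipaddress` module to list every way an unbracketed host string could be read if a trailing `:port` were allowed.

```python
import ipaddress


def is_ipv6(text):
    """Return True if text parses as an IPv6 address literal."""
    try:
        ipaddress.IPv6Address(text)
        return True
    except ValueError:
        return False


def possible_readings(host):
    """List every (address, port) reading of an unbracketed host string.

    Hypothetical helper for illustration only. A port of None means the
    whole string was read as the address.
    """
    readings = []
    # Reading 1: the entire string is an IPv6 address (today's behavior).
    if is_ipv6(host):
        readings.append((host, None))
    # Reading 2: the last colon separates the address from a port number.
    head, sep, tail = host.rpartition(":")
    if sep and tail.isdigit() and int(tail) <= 65535 and is_ipv6(head):
        readings.append((head, int(tail)))
    return readings


for reading in possible_readings("1:2:3::4:5:6"):
    print(reading)
```

Under these assumptions, host=1:2:3::4:5:6 yields two readings, one with no port and one with port 6, while a full eight-group address such as 1:2:3:4:5:6:7:8 yields only one.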

So now I think that to make this work correctly, we're going to need
both to change the URL parser and to add parsing for the host and
port. Let's say the user says this:

postgresql://[1::2]:3,[4::5],[6::7]:8/

What I think we need to do is translate that into this:

host=1::2,4::5,6::7 port=3,,8
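
A minimal sketch of that translation step, in Python rather than libpq's C. The function name `split_authority` is hypothetical, and the sketch assumes the scheme, user info, and path have already been stripped from the URL, leaving only the comma-separated host list.

```python
def split_authority(authority):
    """Translate 'hostspec[,hostspec...]' into (host, port) list strings.

    Hypothetical helper: each hostspec is 'name', 'name:port', '[v6addr]'
    or '[v6addr]:port'. A missing port becomes an empty list entry.
    """
    hosts, ports = [], []
    for spec in authority.split(","):
        if spec.startswith("["):
            # Bracketed IPv6 literal: everything up to ']' is the address.
            end = spec.find("]")
            if end == -1:
                raise ValueError(f"missing ']' in {spec!r}")
            host, rest = spec[1:end], spec[end + 1:]
            if rest and not rest.startswith(":"):
                raise ValueError(f"unexpected text after ']' in {spec!r}")
            port = rest[1:]
        else:
            # Unbracketed name or IPv4 address: the first colon starts the port.
            host, _, port = spec.partition(":")
        if port and not port.isdigit():
            raise ValueError(f"invalid port in {spec!r}")
        hosts.append(host)
        ports.append(port)
    return ",".join(hosts), ",".join(ports)


host, port = split_authority("[1::2]:3,[4::5],[6::7]:8")
print(f"host={host} port={port}")
```

Because the brackets are consumed during URL parsing, the resulting host list carries bare IPv6 addresses and never needs a port suffix of its own.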

Note the double-comma, indicating a blank port number for the second
URL component. When we parse those host and port strings, we can
match up each host with the corresponding port number again. Of
course, the user could also skip the URL format and directly specify a
connection string. And then they might write one where the host and
port parts don't have the same number of components, like this:

host=a,b,c port=3,4
or
host=a,b port=3,4,5

It is obvious what is meant if multiple hosts are given but only a
single port number is specified; it is also obvious what is meant if
the number of ports is equal to the number of hosts. It is not
obvious what it means if there are multiple ports but the number
doesn't equal the number of hosts. I think we can either treat that
case as an error or else do the following: if there are extra port
specifiers, ignore them; if there are extra host specifiers, use the
last port in the list for all of the remaining hosts.
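
Both options can be sketched in a few lines. This is an illustration under stated assumptions, not the patch's implementation: `pair_hosts_and_ports` and its `strict` flag are hypothetical names, and an empty port string stands for "use the default port".

```python
def pair_hosts_and_ports(host, port, strict=True):
    """Match each host in a comma-separated list with a port.

    Hypothetical helper. An empty port entry means the default port.
    """
    hosts = host.split(",")
    ports = port.split(",") if port else [""]
    if len(ports) == 1:
        # A single port applies to every host.
        ports = ports * len(hosts)
    elif len(ports) != len(hosts):
        if strict:
            # Option 1: treat a count mismatch as an error.
            raise ValueError(
                f"{len(hosts)} hosts but {len(ports)} ports specified"
            )
        # Option 2: ignore extra ports, and reuse the last port
        # for any hosts that have none.
        ports = ports[:len(hosts)]
        ports += [ports[-1]] * (len(hosts) - len(ports))
    return list(zip(hosts, ports))


print(pair_hosts_and_ports("1::2,4::5,6::7", "3,,8"))
print(pair_hosts_and_ports("a,b,c", "3,4", strict=False))
print(pair_hosts_and_ports("a,b", "3,4,5", strict=False))
```

With `strict=False`, host=a,b,c port=3,4 pairs c with port 4, and host=a,b port=3,4,5 silently drops port 5. The strict variant rejects both, which avoids guessing at the user's intent.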

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
