Proposal: Implement failover on libpq connect level.

From: Victor Wagner <vitus(at)wagner(dot)pp(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Proposal: Implement failover on libpq connect level.
Date: 2015-08-18 04:18:50
Message-ID: 20150818041850.GA5092@wagner.pp.ru
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackerspgsql-jdbc

Rationale
=========

Since introduction of the WAL-based replication into the PostgreSQL, it is
possible to create high-availability and load-balancing clusters.

However, there is no support for failover in the client libraries. So, only
way to provide transparent for client application failover is IP address
migration. This approach has some limitation, i.e. it requires that
master and backup servers reside in the same subnet or may not be
feasible for other reasons.

Commercial RDBMS, such as Oracle, employ more flexible approach. They
allow to specify multiple servers in the connect string, so if primary
server is not available, client library tries to connect to other ones.

This approach allows to use geographically distributed failover clusters
and also is a cheap way to implement load-balancing (which is not
possible with IP address migration).

Proposed change
===============

Allow to specify multiple hosts in the libpq connect string. Make libpq
attempt to connect to all host simultaneously or in random order
and use of the server which successfully establishes connection first.

Syntax
------

Libpq connect string can be either set of the keyword=value pairs
or an URL.

In the first form it can be just allowed to specify keyword host
multiple times.

"host=main-server host=standby1 host=standby2 port=5432 dbname=database"

In the second form host can be specified either in the first part of URL
or in the query parameters.

postgresql://user(at)host/database

postgresql:///database?host=hostname&user=username

If host is specified as a parameter, it is also possible to allow
multiple host parameters without breaking existing syntax.

postgresql:///database?host=main-server&host=standby1&host=standby2

In order to implement load-balancing clusters, additional parameters
should be added readonly=boolean and loadbalancing=boolean

Support for this syntax extensions is added to the PQconnectdb,
PQconnectdbParams, PQConnectStart and PQConnectStartParams,
but not PQsetdb/PQsetdblogin functions.

Behavoir
--------

If PQconnectdb encounters connect string with multiple hosts specified,
it attempts to establish connection with all these hosts simultaneously,
and begins to work with server which responds first, unless
loadbalancing parameter is true.

If the loadbalancing parameter is true, it tries servers sequentially in
the random order.

If the parameter readonly is false, after authenticating with server it
executes show transaction_read_only, to find out whether current
connection is to the master or to the hot standby, and connection is
considered successful only if server allows read write transactions.

This allows to have clients which write to the database and clients
which perform read-only access. Read-only clients would be load-balanced
between the master and slave servers, and read-write clients connect only to
the master (whichever server has this role at the moment of connection).

Information of the alternate servers should be stored in the PGconn structure.

Function PQreset should be able to take advantage of new syntax and
possibly open connection to the new master, if failover occurred
during lifetime of the connection.

Possible drawbacks
==================

Compatibility
-------------

Proposed patch requires no modifications to the server or protocol, and
modification of synchronous function (PQconnectdb, PQconnectdbParams)
doesn't introduce incompatible changes to the client library.

Even if connect string with multiple host would be erroneously used
with version of libpq, which do not support this feature, it is not an
error. It just use last host specified in the connect string.

There could be some compatibility problems with asynchronous connections
created with PQConnectStart functions. Problem is that we are trying
to establish several connections at once, and there are several sockets
which should be integrated into application event loop.

Even if we would try servers in some particular order (such as randomized
order during load balancing), file descriptor of socket can change during
execution PQConnectPoll, and existing applications are not prepared to it.

Performance impact
------------------

Performance impact seems to be negligible.

1. If connect string contain only one host, the only complication is the
maintenance of the data structure, which possible can hold more than
one host name. Connection process itself would not be affected.

2. If there is pure high-availability cluster, i.e. standby servers do
not accept client connections on the specified port, there is no extra
load on standby servers, and almost no (only several unsuccessful
connect calls) on client.

3. If there is load balancing cluster, there is no performance impacts
for read-only client, but each read-write client causes standby servers
to process extra connection to the point where server can report
read-only state of transaction (i.e. including SSL handshake and
postgresql authentication). Typically, in the situation where read-only
clients should be load-balanced using this feature, there are much more
read-only clients, than read-write ones. So some extra load related with
read-write connection seems to be justified by simplification of client
configuration.

--
Victor Wagner <vitus(at)wagner(dot)pp(dot)ru>

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-08-18 04:35:46 Re: WIP: SCRAM authentication
Previous Message Jeff Janes 2015-08-18 04:11:51 Re: Potential GIN vacuum bug

Browse pgsql-jdbc by date

  From Date Subject
Next Message Albe Laurenz 2015-08-18 08:32:28 Re: Proposal: Implement failover on libpq connect level.
Previous Message Dave Cramer 2015-08-15 19:22:46 Re: Patch fo select ip and/or port for outgoing connections