Re: Automatic Client Failover

From: Markus Wanner <markus(at)bluegap(dot)ch>
To: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, josh(at)agliodbs(dot)com, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Automatic Client Failover
Date: 2008-08-05 13:57:32
Message-ID: 48985C4C.20908@bluegap.ch
Lists: pgsql-hackers

Hi,

Dimitri Fontaine wrote:
> I'm thinking in terms of a single master, multiple slaves scenario...
> In the single master case, each slave only needs to know who the current
> master is and whether it can itself process read-only queries (locally).

I don't think that's as trivial as you make it sound. I'd rather put it
as: all nodes need to agree on exactly one master node at any given
point in time. However, IMO that has nothing to do with automatic client
failover.

> You seem to be thinking in terms of multi-master, where choosing a
> master node is a different concern, as a failing master does not imply slave
> promotion.

I'm thinking about the problem that ACF tries to solve: connection
losses between the client and one of the servers (no matter if it's a
master or a slave). As opposed to a traditional single-node database,
there might be other servers available to connect to once a client has
lost its current connection (and thus suspects the server behind that
connection to have gone down).

Redirecting write transactions from slaves to the master node solves
another problem. Being able to 'rescue' such forwarded connections in
case of a failure of the master is just a nice side effect. But it
doesn't solve the problem of connection losses between a client and the
master.

> Well, in the single master case I'm not sure I agree, but in the case of a
> multi-master configuration, it does seem that choosing some live master is
> a client task.

Given a failure of the master server, how do you expect clients, which
were connected to that master server, to "failover"? Some way or
another, they need to be able to (re)connect to one of the slaves (which
possibly turned into the new master by then).

Of course, you can put that burden on the application and simply let
it try to connect to another server upon connection failures. AFAIU
Simon is proposing to put that logic into libpq. I see merits in that
for multiple replication solutions and don't think anything exclusively
server-side could solve the same issue (because the client currently
only has one connection to one server, which might fail at any time).

[ Please note that you still need the retry-loop in the application. It
mainly saves having to care about the list of servers and server states
in the app. ]
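
To illustrate the split of responsibilities, here is a rough Python
sketch. It is not libpq code and not Simon's proposal; the names
connect_any and run_with_retry, the server list and the injected
connect callable are all hypothetical. connect_any stands in for the
part a failover-aware client library could take over (walking a list
of servers), while run_with_retry is the loop the application still
has to provide.

```python
from typing import Callable, Optional, Sequence, TypeVar

Conn = TypeVar("Conn")
Result = TypeVar("Result")


def connect_any(servers: Sequence[str],
                connect: Callable[[str], Conn]) -> Conn:
    """Return a connection to the first server that accepts one.

    `connect` is a hypothetical callable supplied by the caller; it is
    assumed to raise ConnectionError when a server is unreachable.
    """
    last_error: Optional[Exception] = None
    for server in servers:
        try:
            return connect(server)
        except ConnectionError as exc:
            last_error = exc  # remember the failure, try the next server
    raise ConnectionError(f"no server reachable: {last_error}")


def run_with_retry(servers: Sequence[str],
                   connect: Callable[[str], Conn],
                   transaction: Callable[[Conn], Result],
                   max_attempts: int = 3) -> Result:
    """Application-level retry loop around a whole transaction.

    If the connection is lost mid-transaction, reconnect (possibly to a
    different server) and run the transaction again from the start.
    If no server at all is reachable, connect_any's error propagates.
    """
    for _ in range(max_attempts):
        conn = connect_any(servers, connect)
        try:
            return transaction(conn)
        except ConnectionError:
            continue  # connection lost: reconnect and redo the transaction
    raise ConnectionError(
        f"transaction failed after {max_attempts} attempts")
```

The point of the sketch is only that the application no longer needs to
know which server is up; it still has to be prepared to redo its
transaction after a connection loss.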

> Now what about the multi-master multi-slave case? Does such a configuration
> make sense?

Heh.. I'm glad you are asking. ;-)

IMO the only reason for master-slave replication is ease of
implementation. It's certainly not something a sane end user would ever
request on his own because he needs that "feature". After all, not
being able to run writing queries on certain nodes is not a feature, but
a plain limitation.

In your question, you are implicitly assuming an existing multi-master
implementation. Given my reasoning, this would make an additional
master-slave replication pretty useless. Thus I'm claiming that such a
configuration does not make sense.

> If this ever becomes possible (2 active/active master servers, with some
> slaves for long running queries, e.g.), then you may want the ACF-enabled
> connection routine to choose to connect to any master or slave in the pool,

You can do the same with multi-master replication, without any disadvantage.

> and have the slave itself be an ACF client to target some live master.

Huh? ACF for master-slave communication? That implies that slaves are
connected to the master(s) via libpq, which I think is not such a good fit.

Regards

Markus Wanner
