From: Markus Wanner <markus(at)bluegap(dot)ch>
To: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, josh(at)agliodbs(dot)com, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Automatic Client Failover
Date: 2008-08-05 17:09:09
Message-ID: 48988935.5000808@bluegap.ch
Lists: pgsql-hackers

Hi,

(sorry... I'm typing too fast and hitting the wrong keys... continuing
the previous mail now...)

Dimitri Fontaine wrote:
> Now, this configuration needs to be resistant to network failure of any node,

Yeah, increasing availability is the primary purpose of doing replication.

> central one included. So I don't want synchronous replication, thanks.

I do not understand that reasoning. Synchronous replication is
certainly *more* resilient to network failures, as it does *not* lose
any data on failover.

However, you are speaking about "logs" and "stats". That certainly
sounds like data you can afford to lose during a failover, because you
can easily recreate it. And since asynchronous replication is faster,
you should prefer async replication here, IMO.

> And I
> don't want multi-master either, as I WANT to forbid central to edit data from
> the servers, and to forbid servers to edit data coming from the backoffice.

Well, I'd say you are (ab)using replication as an access control
mechanism. That's not quite what it's made for, but you can certainly
use it that way.

As I understand master-slave replication, a slave should be able to take
over from the master in case the master fails. At that point, the slave
must suddenly become writable and your access control is void.

If you prevent that, you are using replication only to transfer data,
not to increase availability. That's fine, but it's quite a different
use case, and one I admittedly hadn't thought about. Thanks for pointing
me to this use of replication.

We could probably combine Postgres-R (for multi-master replication) with
londiste (to transfer selected data asynchronously to other nodes).

> Of course, if I want HA, whatever features and failure autodetection
> PostgreSQL gives me, I still need ACF.

Agreed.
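
To illustrate, here's a minimal sketch of what such client-side failover
could look like from a Python application using psycopg2. The host names
are made up, and it deliberately ignores the harder problem of deciding
whether the preferred master is really dead rather than merely
unreachable:

    import psycopg2

    # Candidate servers, in order of preference (hostnames are
    # illustrative only).
    CANDIDATE_DSNS = [
        "host=pg-master dbname=app user=app",
        "host=pg-slave dbname=app user=app",
    ]

    def connect_with_failover(dsns):
        # Return a connection to the first server that accepts one.
        for dsn in dsns:
            try:
                return psycopg2.connect(dsn)
            except psycopg2.OperationalError:
                continue  # server down or unreachable, try the next
        raise psycopg2.OperationalError("no candidate server reachable")

    conn = connect_with_failover(CANDIDATE_DSNS)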

> And if I get master/slave instead of
> master/master, I need STONITH and heartbeat or equivalent.

A two-node setup with STONITH has the disadvantage that you need manual
intervention to bring a crashed node up again (to remove the bullet
from inside its head).

I thus recommend using at least three nodes for any kind of
high-availability setup, even if the third one only serves as a quorum
node and doesn't hold a replica of the data. That allows node recovery
to be automated, which not only eases administration but also eliminates
a possible source of errors.
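
To make the quorum idea concrete, here is a toy majority check; the
is_reachable() probe and the node counts are hypothetical stand-ins for
whatever liveness mechanism the cluster actually uses:

    # A node may only take (or keep) the master role if it sees a
    # majority of the cluster, counting itself. A partitioned minority
    # thus never takes over.
    def have_quorum(peers, is_reachable):
        visible = 1 + sum(1 for p in peers if is_reachable(p))
        return visible > (len(peers) + 1) // 2

    # With three nodes (two peers), seeing one peer means 2 of 3 are
    # visible: a majority. An isolated node sees only itself and must
    # not promote.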

> I was just trying to propose ideas for having those external part as easy as
> possible to get right with whatever integrated solution comes from -core.

Yeah, that'd be great.

However, ISTM that it's not yet quite clear which solution will get
integrated into -core.

>> Huh? ACF for master-slave communication? That implies that slaves are
>> connected to the master(s) via libpq, which I think is not such a good fit.
>
> I'm using londiste (from Skytools), a master/slaves replication solution in
> Python. I'm not sure whether the psycopg component is using libpq or
> implementing the FE protocol itself, but it seems to me that in any case it
> would be a candidate to benefit from Simon's proposal.

Hm.. yeah, that might be true. On the other hand, the servers in the
cluster need to keep track of their state anyway, so there's not that
much to be gained here.

Regards

Markus Wanner
