Re: Support for N synchronous standby servers - take 2

From: Andres Freund <andres(at)anarazel(dot)de>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Subject: Re: Support for N synchronous standby servers - take 2
Date: 2015-07-02 19:44:58
Message-ID: 20150702194458.GH16267@alap3.anarazel.de
Lists: pgsql-hackers

On 2015-07-02 11:50:44 -0700, Josh Berkus wrote:
> So there are two parts to this:
>
> 1. I need to ensure that data is replicated to X places.
>
> 2. I need to *know* which places data was synchronously replicated to
> when the master goes down.
>
> My entire point is that (1) alone is useless unless you also have (2).

I think there's a good set of use cases where that's really not the case.

> And do note that I'm talking about information on the replica, not on
> the master, since in any failure situation we don't have the old
> master around to check.

How would you, even theoretically, synchronize that knowledge to all the
replicas? Even when they're temporarily disconnected?

> Say you take this case:
>
> "2" : { "local_replica", "london_server", "nyc_server" }
>
> ... which should ensure that any data which is replicated is replicated
> to at least two places, so that even if you lose the entire local
> datacenter, you have the data on at least one remote data center.

> EXCEPT: say you lose both the local datacenter and communication with
> the London server at the same time (due to transatlantic cable issues, a
> huge DDoS, or whatever). You'd like to promote the NYC server to be the
> new master, but only if it was in sync at the time its communication
> with the original master was lost ... except that you have no way of
> knowing that.

Pick up the phone, compare the LSNs, done.
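
Concretely (a minimal sketch; the LSN values below are invented, and the
function names are the 9.4-era ones):

    -- Run on each candidate standby: how far has WAL replay progressed?
    SELECT pg_last_xlog_replay_location();

    -- Compare the two reported positions; a positive byte difference
    -- means the first position is further ahead.
    SELECT pg_xlog_location_diff('0/5000060'::pg_lsn, '0/5000028'::pg_lsn);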

> Given that, we haven't really reduced our data loss potential or
> improved availability from the current 1-redundant synch rep. We still
> need to wait to get the London server back to figure out if we want to
> promote or not.
>
> Now, this configuration would reduce the data loss window:
>
> "3" : { "local_replica", "london_server", "nyc_server" }
>
> As would this one:
>
> "2" : { "local_replica", "nyc_server" }
>
> ... because we would know definitively which servers were in sync. So
> maybe that's the use case we should be supporting?

If you want automated failover, you need a leader election amongst the
surviving nodes. The replay position is all they need to elect the node
that's furthest ahead, and that information exists today.
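
As a sketch of that election step: pg_lsn values order naturally, so
picking the winner from the collected replay positions is just a max
(the node names and positions here are invented):

    -- Replay positions gathered from the surviving nodes; the
    -- furthest-ahead node wins the election.
    SELECT node, replay_location
      FROM (VALUES ('nyc_server',    '0/5000060'::pg_lsn),
                   ('london_server', '0/5000028'::pg_lsn))
           AS t(node, replay_location)
     ORDER BY replay_location DESC
     LIMIT 1;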

Greetings,

Andres Freund
