Re: Synch failover WAS: Support for N synchronous standby servers - take 2

From: Andres Freund <andres(at)anarazel(dot)de>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Subject: Re: Synch failover WAS: Support for N synchronous standby servers - take 2
Date: 2015-07-03 11:40:04
Message-ID: 20150703114004.GD3291@awork2.anarazel.de
Lists: pgsql-hackers

On 2015-07-02 14:54:19 -0700, Josh Berkus wrote:
> On 07/02/2015 12:44 PM, Andres Freund wrote:
> > On 2015-07-02 11:50:44 -0700, Josh Berkus wrote:
> >> So there's two parts to this:
> >>
> >> 1. I need to ensure that data is replicated to X places.
> >>
> >> 2. I need to *know* which places data was synchronously replicated to
> >> when the master goes down.
> >>
> >> My entire point is that (1) alone is useless unless you also have (2).
> >
> > I think there's a good set of usecases where that's really not the case.
>
> Please share! My plea for usecases was sincere. I can't think of any.

"I have important data. I want to survive a local hardware failure
(it's faster to continue using the local standby), and I also want to
protect myself against actual disaster striking the primary
datacenter". Pretty common.

> >> And do note that I'm talking about information on the replica, not on
> >> the master, since in any failure situation we don't have the old
> >> master around to check.
> >
> > How would you, even theoretically, synchronize that knowledge to all the
> > replicas? Even when they're temporarily disconnected?
>
> You can't, which is why what we need to know is when the replica thinks
> it was last synced from the replica side. That is, a sync timestamp and
> lsn from the last time the replica ack'd a sync commit back to the
> master successfully. Based on that information, I can make an informed
> decision, even if I'm down to one replica.

I think you're mashing together nearly unrelated topics.

Note that we already have the last replayed lsn, and we have the
timestamp of the last replayed transaction.
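
To illustrate how the replay position alone is enough to pick the node
that is furthest ahead, here is a minimal sketch. It assumes some
external tool has already collected each surviving standby's replay LSN
as text (in 9.x releases that is what pg_last_xlog_replay_location()
returns, with pg_last_xact_replay_timestamp() giving the timestamp).
The node names, LSN values, and helper names are hypothetical; this is
not an existing tool.

```python
from typing import Dict


def parse_lsn(lsn: str) -> int:
    """Convert an LSN in 'hi/lo' hexadecimal text form to a 64-bit integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def furthest_ahead(replay_positions: Dict[str, str]) -> str:
    """Return the name of the node with the highest replay LSN."""
    if not replay_positions:
        raise ValueError("no surviving node reported a replay position")
    return max(replay_positions, key=lambda node: parse_lsn(replay_positions[node]))


# Hypothetical node names and LSN values, for illustration only.
survivors = {
    "local_standby": "16/B374D848",
    "dc2_standby": "16/B3740000",
    "dc3_standby": "9/FFFFFFFF",
}
winner = furthest_ahead(survivors)
```

The LSNs are parsed to integers rather than compared as strings, since a
plain string comparison would wrongly rank "9/FFFFFFFF" above
"16/B374D848".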

> > If you want automated failover you need a leader election amongst the
> > surviving nodes. The replay position is all they need to elect the node
> > that's furthest ahead, and that information exists today.
>
> I can do that already. If quorum synch commit doesn't help us minimize
> data loss any better than async replication or the current 1-redundant,
> why would we want it? If it does help us minimize data loss, how?

But it does make us safer against data loss? If your app gets back the
commit you know that the data has made it to both the local replica and
one other datacenter. And you're now safe against the loss of either the
master's hardware (the most likely scenario) or the entire primary
datacenter. That you need additional logic to know which other
datacenter to fail over to is just yet another piece (which you *can*
build today).
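
The guarantee described above can be written down as a small predicate.
This is only a sketch of the rule (one acknowledgement from a standby in
the primary's datacenter plus one from any other datacenter), not how
PostgreSQL implements synchronous commit; the function name, the
datacenter labels, and the (standby, datacenter) pair format are all
assumptions made for illustration.

```python
from typing import Iterable, Tuple


def commit_is_protected(acks: Iterable[Tuple[str, str]], primary_dc: str) -> bool:
    """Check whether a commit survives both failure scenarios.

    acks holds (standby_name, datacenter) pairs for every standby that
    has confirmed the commit. The commit counts as protected once a
    standby in the primary's datacenter and a standby in at least one
    other datacenter have both confirmed it.
    """
    datacenters = {dc for _, dc in acks}
    has_local = primary_dc in datacenters
    has_remote = any(dc != primary_dc for dc in datacenters)
    return has_local and has_remote


# Hypothetical standby and datacenter names.
local_only = [("standby_a", "dc1")]
local_and_remote = [("standby_a", "dc1"), ("standby_b", "dc2")]
```

With only the local acknowledgement the commit is safe against a
hardware failure of the master but not against losing the datacenter;
with both, it is safe against either.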
