Re: Synchronization levels in SR

From: Markus Wanner <markus(at)bluegap(dot)ch>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Synchronization levels in SR
Date: 2010-09-07 15:55:49
Message-ID: 4C866085.6070609@bluegap.ch
Lists: pgsql-hackers

Hi,

On 09/07/2010 05:17 PM, Tom Lane wrote:
> Oh yes it is. If the slave replays WAL that didn't happen on the
> master, it might for instance have heap tuples in TID slots that are
> empty on the master, or index pages laid out differently from the
> master. Trying to apply additional WAL from the master will fail badly.

Sure. Reverting to the master's state would be required to be able to
safely proceed. Granted, that's far from simple.

Robert's argument about read queries on the standby convinced me that
you always need to recover to the node with the newest transactions
applied (i.e. better to advance than to revert). Making sure the
standby can never be ahead of the master node is certainly the simplest
way to guarantee that, though it comes at a cost for normal operation.
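
To make that rule concrete, here is a minimal sketch, not actual
PostgreSQL code. It assumes WAL positions can be treated as plain
integers, and the names replay_limit, received_lsn and
master_flushed_lsn are made up for illustration.

```python
def replay_limit(received_lsn: int, master_flushed_lsn: int) -> int:
    """Highest WAL position the standby may replay.

    Hypothetical helper: the standby may hold more WAL than the master
    has confirmed as flushed, but it must not apply anything beyond
    that confirmed position.
    """
    return min(received_lsn, master_flushed_lsn)


# The standby has received WAL up to 120, but the master has only
# confirmed a flush up to 100, so replay has to stop at 100.
limit = replay_limit(received_lsn=120, master_flushed_lsn=100)
```

The cost mentioned above shows up here as waiting: the standby cannot
move past the limit until the master reports a newer flush position.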

How about a master failure which leads to a fail-over, immediately
followed by a failure of that former standby (now the master)? The old
master might then be in the very same situation: having WAL applied that
the new master doesn't have. Do we require former masters to fetch a
base backup? How does it know the difference, once it gets back up?
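
One conceivable way for a returning old master to tell the difference,
purely as a sketch: compare its own last WAL position with the position
at which the new master was promoted. This assumes integer WAL
positions and that the promotion point is recorded somewhere the old
master can read it; Rejoin, rejoin_action and promotion_lsn are
hypothetical names, not existing PostgreSQL interfaces.

```python
from enum import Enum


class Rejoin(Enum):
    STREAM = "resume streaming from the new master"
    REBASE = "has WAL the new master lacks: revert or fetch a base backup"


def rejoin_action(old_master_last_lsn: int, promotion_lsn: int) -> Rejoin:
    """Decide how a former master may rejoin (hypothetical helper).

    promotion_lsn is assumed to be the last position of the old
    master's WAL that the standby had applied when it was promoted.
    """
    if old_master_last_lsn > promotion_lsn:
        # The old master wrote WAL the new master never saw.
        return Rejoin.REBASE
    return Rejoin.STREAM


# The old master got as far as 250, the standby was promoted at 240.
action = rejoin_action(old_master_last_lsn=250, promotion_lsn=240)
```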

> We can *not* allow the slave to replay WAL ahead of what is known
> committed to disk on the master. The only way to make that safe
> is the compare-notes-and-ship-WAL-back approach that Robert mentioned.

Agreed.

(And it's worth pointing out that this approach has a pretty nasty
requirement for a full-cluster crash: every node that was a target of
synchronous replication needs to come back up after such a crash, so as
to be able to reliably determine which one has the newest transaction.)
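
A small sketch of why that requirement bites, again assuming integer
WAL positions and hypothetical names (newest_node, expected, reported):
as long as one synchronously replicated node has not reported back, it
might be the one holding the newest transaction, so no safe answer
exists yet.

```python
from typing import Dict, Optional, Set


def newest_node(expected: Set[str], reported: Dict[str, int]) -> Optional[str]:
    """Return the node with the highest WAL position, or None.

    None means at least one expected node has not reported yet, so the
    newest node cannot be determined reliably.
    """
    missing = expected - set(reported)
    if missing:
        return None
    # sorted() only makes the choice deterministic when positions tie.
    return max(sorted(expected), key=lambda node: reported[node])


nodes = {"a", "b", "c"}
partial = newest_node(nodes, {"a": 10, "b": 12})            # "c" is missing
complete = newest_node(nodes, {"a": 10, "b": 12, "c": 11})  # all reported
```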

> If you feel that decoupling WAL application is absolutely essential
> to have a credible feature, then you'd better bite the bullet and
> start working on the ship-WAL-back code.

My feeling is that WAL is the wrong format for replication. But that's
another story.

Regards

Markus Wanner
