Re: Synchronization levels in SR

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Markus Wanner <markus(at)bluegap(dot)ch>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Synchronization levels in SR
Date: 2010-09-07 15:17:22
Message-ID: 10419.1283872642@sss.pgh.pa.us
Lists: pgsql-hackers

Markus Wanner <markus(at)bluegap(dot)ch> writes:
> On 09/07/2010 04:15 PM, Robert Haas wrote:
>> In theory, that's true, but if we do that, then there's an even bigger
>> problem: the slave might have replayed WAL ahead of the master
>> location; therefore the slave is now corrupt and a new base backup
>> must be taken.

> The slave isn't corrupt. It would suffice to "late abort" committed
> transactions the master doesn't know about.

Oh yes it is. If the slave replays WAL that didn't happen on the
master, it might for instance have heap tuples in TID slots that are
empty on the master, or index pages laid out differently from the
master. Trying to apply additional WAL from the master will fail badly.

We can *not* allow the slave to replay WAL ahead of what is known
committed to disk on the master. The only way to make that safe
is the compare-notes-and-ship-WAL-back approach that Robert mentioned.
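
To illustrate the rule above, here is a toy sketch of a standby that
receives WAL freely but replays only up to the position the master reports
as flushed to disk. This is not PostgreSQL code: the Standby class, its
fields, the integer LSNs, and the master_flush_lsn parameter are all
hypothetical names made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Standby:
    """Toy standby: buffers received WAL, replays only what the master has flushed."""

    received: list = field(default_factory=list)  # (lsn, payload) pairs in LSN order
    replayed_lsn: int = 0  # highest LSN applied so far

    def receive(self, lsn, payload):
        # Receiving (and even writing) WAL early is harmless; replaying it is not.
        self.received.append((lsn, payload))

    def replay_up_to(self, master_flush_lsn):
        """Apply buffered records with lsn <= master_flush_lsn; return what was applied."""
        applied = []
        for lsn, payload in self.received:
            if self.replayed_lsn < lsn <= master_flush_lsn:
                applied.append(payload)
                self.replayed_lsn = lsn
        return applied


standby = Standby()
for lsn, payload in [(10, "insert A"), (20, "insert B"), (30, "insert C")]:
    standby.receive(lsn, payload)

# The master has flushed only up to LSN 20, so record 30 stays unapplied.
print(standby.replay_up_to(20))
print(standby.replayed_lsn)
```

If the master crashed here and came back with its WAL ending at LSN 20,
this standby would hold no replayed changes that the master lacks, so
further WAL from the master would still apply cleanly.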

If you feel that decoupling WAL application is absolutely essential
to have a credible feature, then you'd better bite the bullet and
start working on the ship-WAL-back code.

regards, tom lane
