Re: Inconsistent DB data in Streaming Replication

From: Ants Aasma <ants(at)cybertec(dot)at>
To: sthomas(at)optionshouse(dot)com
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Samrat Revagade <revagade(dot)samrat(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Inconsistent DB data in Streaming Replication
Date: 2013-04-10 17:14:15
Message-ID: CA+CSw_ukN7NCNWgUY5cq7W-ku9D=7LXEpKfKfh+T=figjbEn+w@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 10, 2013 at 7:44 PM, Shaun Thomas <sthomas(at)optionshouse(dot)com> wrote:
> On 04/10/2013 11:40 AM, Fujii Masao wrote:
>
>> Strange. If this is really true, shared disk failover solution is
>> fundamentally broken because the standby needs to start up with the
>> shared "corrupted" database at the failover.
>
>
> How so? Shared disk doesn't use replication. The point I was trying to make
> is that replication requires synchronization between two disparate servers,
> and verifying they have exactly the same data is a non-trivial exercise.
> Even a single transaction after a failover (effectively) negates the old
> server because there's no easy "catch up" mechanism yet.
>
> Even if this isn't necessarily true, it's the safest approach IMO.

We already rely on WAL-before-data to ensure correct recovery. What is
proposed here is to slightly redefine it to require WAL to be
replicated before it is considered to be flushed. This ensures that no
data page on disk differs from the WAL that the slave has. The
machinery to do this is already mostly there: we already wait for WAL
flushes, and we know the write location on the slave. The second
requirement is that we never start up as master and we don't trust any
local WAL. This is actually how Pacemaker clusters work; you would
only need to amend the resource agent (RA) to wipe the WAL and
configure PostgreSQL with restart_after_crash = false.
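
To illustrate the redefined rule, here is a minimal Python sketch. It is
not PostgreSQL code: the names WalState, effective_flush_lsn and
may_write_page are hypothetical, and LSNs are modelled as plain integers.
The assumption is that WAL counts as flushed only up to the lower of the
local flush location and the write location reported by the slave.

```python
from dataclasses import dataclass


@dataclass
class WalState:
    """Hypothetical snapshot of WAL positions, with LSNs as plain integers."""
    local_flush_lsn: int    # WAL durably flushed on the master
    standby_write_lsn: int  # WAL write location reported by the slave


def effective_flush_lsn(state: WalState) -> int:
    """WAL is treated as flushed only once the slave has it as well."""
    return min(state.local_flush_lsn, state.standby_write_lsn)


def may_write_page(page_lsn: int, state: WalState) -> bool:
    """WAL-before-data under the redefined rule: a data page may go to
    disk only if the WAL up to its LSN is covered by the effective
    flush location."""
    return page_lsn <= effective_flush_lsn(state)


if __name__ == "__main__":
    state = WalState(local_flush_lsn=200, standby_write_lsn=150)
    print(may_write_page(100, state))  # page covered by replicated WAL
    print(may_write_page(180, state))  # page ahead of the slave's WAL
```

Under this rule a page with LSN 180 stays in memory until the slave
reports a write location of at least 180, even though the master has
already flushed that WAL locally.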

It would be very helpful in restoring HA capability after failover if
we didn't have to read through the whole database after a VM goes
down and is migrated with the shared disk onto a new host.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
