Re: Inconsistent DB data in Streaming Replication

From: Shaun Thomas <sthomas(at)optionshouse(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, 'Samrat Revagade' <revagade(dot)samrat(at)gmail(dot)com>, 'Hannu Krosing' <hannu(at)2ndquadrant(dot)com>, 'Fujii Masao' <masao(dot)fujii(at)gmail(dot)com>, 'PostgreSQL-development' <pgsql-hackers(at)postgresql(dot)org>, <ants(at)cybertec(dot)at>, <andres(at)2ndquadrant(dot)com>
Subject: Re: Inconsistent DB data in Streaming Replication
Date: 2013-04-10 14:26:11
Message-ID: 51657683.1070408@optionshouse.com
Lists: pgsql-hackers

On 04/10/2013 09:10 AM, Tom Lane wrote:

> IOW, I wouldn't consider skipping the rsync even if I had a feature
> like this.

Totally. Out in the field, we consider the "old" database corrupt the
moment we fail over. There is literally no way to verify the safety of
any data along the broken chain, given race conditions and multiple
potential failure points.

The only potential use case for this that I can see would be system
maintenance and a controlled failover. I agree that the full rsync is a
major PITA when doing DR testing, but I personally don't think this is
the way to fix that particular edge case.

Maybe checksums will fix this in the long run... I don't know. DRBD has
a handy block-level verify function for things like this, and it can
re-sync master/slave data by comparing the commit log across the servers
if you tell it one node should be considered incorrect.

The thing is... we have clogs, and we have WAL. If we can assume
bidirectional communication and verification (checksum comparison?) of
both of those components, the database *should* be able to re-sync itself.
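
As a rough illustration of the checksum-comparison idea (a sketch only,
not anything PostgreSQL implements), the following assumes each node can
hand over the raw bytes of its WAL segments keyed by segment file name.
The function names and the in-memory dict layout are hypothetical. It
hashes every segment on both sides and reports the first segment name at
which the two nodes disagree, which is where a re-sync would have to
start.

```python
import hashlib


def segment_digests(segments):
    """Map each segment name to the SHA-256 hex digest of its bytes.

    `segments` is a hypothetical dict of {segment_file_name: bytes}.
    """
    return {
        name: hashlib.sha256(data).hexdigest()
        for name, data in segments.items()
    }


def first_divergence(primary_digests, standby_digests):
    """Return the first segment name whose digest differs between nodes.

    A segment present on only one node also counts as a difference.
    Names are compared in sorted order, which matches sequence order for
    fixed-width WAL file names on a single timeline. Returns None when
    both nodes agree on every segment.
    """
    for name in sorted(set(primary_digests) | set(standby_digests)):
        if primary_digests.get(name) != standby_digests.get(name):
            return name
    return None
```

Exchanging digests instead of segment contents keeps the comparison
cheap, but it only locates the divergence point; whether the data files
can then be repaired from that point is the hard part discussed above.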

Even if that were possible given the internals, I can't see anyone
jumping on this before 9.4 or 9.5 unless someone sponsors the feature.

Automatic re-sync (within the available WAL) would be an awesome
feature, though...

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
sthomas(at)optionshouse(dot)com

