Re: streaming replication breaks horribly if master crashes

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: streaming replication breaks horribly if master crashes
Date: 2010-06-16 20:07:26
Message-ID: 4C192EFE.10204@kaltenbrunner.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 06/16/2010 09:47 PM, Robert Haas wrote:
> On Mon, Jun 14, 2010 at 7:55 AM, Simon Riggs<simon(at)2ndquadrant(dot)com> wrote:
>>> But that change would cause the problem that Robert pointed out.
>>> http://archives.postgresql.org/pgsql-hackers/2010-06/msg00670.php
>>
>> Presumably this means that if synchronous_commit = off on primary that
>> SR in 9.0 will no longer work correctly if the primary crashes?
>
> I spent some time investigating this today and have come to the
> conclusion that streaming replication is really, really broken in the
> face of potential crashes on the master. Using a copy of VMware
> parallels provided by $EMPLOYER, I set up two Fedora 12 virtual
> machines on my MacBook in a master/slave configuration. Then I
> crashed the master repeatedly using 'echo b> /proc/sysrq-trigger',
> which causes an immediate reboot (without syncing the disks, closing
> network connections, etc.) while running pgbench or other stuff
> against it.
>
> The first problem I noticed is that the slave never seems to realize
> that the master has gone away. Every time I crashed the master, I had
> to kill the wal receiver process on the slave to get it to reconnect;
> otherwise it just sat there waiting, either forever or at least for
> longer than I was willing to wait.

well this is likely caused by the OS not noticing that the connections
went away (linux has really long timeouts here) - maybe we should
unconditionally enable keepalive on systems that support that for
replication connections (if that is possible in the current design anyway)

>
> More seriously, I was able to demonstrate that the problem linked in
> the thread above is real: if the master crashes after streaming WAL
> that it hasn't yet fsync'd, then on recovery the slave's xlog position
> is ahead of the master. So far I've only been able to reproduce this
> with fsync=off, but I believe it's possible anyway, and this just
> makes it more likely. After the most recent crash, the master thought
> pg_current_xlog_location() was 1/86CD4000; the slave thought
> pg_last_xlog_receive_location() was 1/8733C000. After reconnecting to
> the master, the slave then thought that
> pg_last_xlog_receive_location() was 1/87000000. The slave didn't
> think this was a problem yet, though. When I then restarted a pgbench
> run against the master, the slave pretty quickly started spewing an
> endless stream of messages complaining of "LOG: invalid record length
> at 1/8733A828".

this is obviously bad but with fsync=off(or sync_commit=off?) it is
probably impossible to prevent...

Stefan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2010-06-16 20:07:54 Re: streaming replication breaks horribly if master crashes
Previous Message Kevin Grittner 2010-06-16 20:00:37 Re: streaming replication breaks horribly if master crashes