[PATCH] A crash and subsequent recovery of the master can cause the slave to get out-of-sync

From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: [PATCH] A crash and subsequent recovery of the master can cause the slave to get out-of-sync
Date: 2007-04-19 20:37:33
Message-ID: 4627D30D.8020803@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

I believe I have discovered the following problem in pgsql 8.2 and HEAD,
concerning warm-standbys using WAL log shipping.

The problem is that after a crash, the master might complete incomplete
actions via rm_cleanup() - but since it won't wal-log those changes,
the slave won't know about this. This will at least prevent the creation
of any further restart points on the slave (because safe_restartpoint)
will never return true again - it it might even cause data corruption,
if subsequent wal records are interpreted wrongly by the slave because
it sees other data than the master did when it generated them.

Attached is a patch that lets RecoveryRestartPoint call all
rm_cleanup() methods and create a restart point whenever it encounters
a shutdown checkpoint in the wal (because those are generated after
recovery). This ought not cause a performance degradation, because
shutdown checkpoints will occur very infrequently.

The patch is per discussion with Simon Riggs.

I've not yet had a chance to test this patch, I only made sure
that it compiles. I'm sending this out now because I hope this
might make it into 8.2.4.

greetings, Florian Pflug

Attachment Content-Type Size
incompleteactions-bugfix.patch text/x-patch 4.9 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Martijn van Oosterhout 2007-04-19 21:04:49 Re: parser dilemma
Previous Message Pavel Stehule 2007-04-19 18:13:49 pgsql crollable cursor doesn't support one form of postgresql's cursors