From: | "Florian G(dot) Pflug" <fgp(at)phlo(dot)org> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [PATCH] A crash and subsequent recovery of themaster can cause the slave to get out-of-sync |
Date: | 2007-04-23 11:19:42 |
Message-ID: | 462C964E.90405@phlo.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Simon Riggs wrote:
> On Thu, 2007-04-19 at 22:37 +0200, Florian G. Pflug wrote:
>> The problem is that after a crash, the master might complete incomplete
>> actions via rm_cleanup() - but since it won't wal-log those changes,
>> the slave won't know about this. This will at least prevent the creation
>> of any further restart points on the slave (because safe_restartpoint)
>> will never return true again - it it might even cause data corruption,
>> if subsequent wal records are interpreted wrongly by the slave because
>> it sees other data than the master did when it generated them.
>
> I agree the problem exists. It is somewhat rare because the idea is that
> if the Primary crashes we would failover to the Standby, which would
> mean that both Primary and Standby have executed rm_cleanup(), if
> needed.
>
> So in the case where the Primary fails and we choose *not* to failover,
> there is a potential difficulty on the Standby.
It occured to me today that this might plague 8.1 too. As you explain below,
the problem is not really connected to restartpoints - failing to create
them is merely a sympton of this. On 8.1, this might still lead to rm_cleanup()
being called much "later" (if you consider the wal position to be the "time")
on the slave than on the master. I dunno if this really causes trouble - I
don't yet understand the btree code well enough to judge this.
> The rationale for this fix could be described somewhat differently:
>
> When we shutdown, we know for certain that safe_restartpoint() is true.
> However, we don't know whether it is true because we successfully did a
> clean shutdown, or because we crashed, recovered and then issued a
> shutdown checkpoint following recovery. In the latter case we *must*
> execute rm_cleanup() on the standby because it has been executed on the
> primary. Not doing so at this point *might* be safe, but is not certain
> to be safe. We don't *need* to log a restartpoint at this time, but it
> seems sensible to do so.
While creating the patch, I've been thinking if it might be worthwile
to note that we just did recovery in the ShutdownCheckpoint
(or create a new checkpoint type RecoveryCheckpoint). This wouldl allow
for more error checking, because then the slave could check that
safe_restartpoint() is true for all ShutdownCheckpoints that were not
after recovering.
> We need to check that rm_cleanup() routines don't assume that they will
> only ever be called once or this will clearly fail. There is also no
> need to call rm_cleanup() unless rm_safe_restartpoint() is false.
But a non-idempotent rm_cleanup() routine will cause trouble anyway,
if postgres crashes after having called rm_cleanup() but before creating
the ShutdownCheckpoint.
greetings, Florian Pflug
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2007-04-23 11:37:36 | Re: Dead Space Map version 3 (simplified) |
Previous Message | Gregory Stark | 2007-04-23 11:12:30 | Re: Dead Space Map version 3 (simplified) |