Re: Failback with log shipping

From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Failback with log shipping
Date: 2010-05-28 19:20:14
Message-ID: m2fx1bn89d.fsf@hi-media.com
Lists: pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> Not shipped before the first failover you mean? No, if any WAL records were
> created in the old master that were not shipped to the standby before
> failover, the corresponding changes to the data files might've been flushed
> to disk already, and you can't undo those by not replaying the WAL record on
> restart.

Ah yes, the failure has to happen after the WAL is written but not yet
sent, and before the next CHECKPOINT, for this to be possible. But
automatic testing of the situation (is the data already safe in PGDATA)
might still be possible?
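
To make the kind of test I have in mind concrete, here is a rough sketch,
not a proven procedure. It assumes you can get two LSNs as text: the
"Latest checkpoint location" that pg_controldata prints for the old
master, and the last location the standby replayed (for example from
pg_last_xlog_replay_location()). The function names are made up for the
example. It only checks the CHECKPOINT condition above and does not by
itself prove that the data files are untouched.

```python
def parse_lsn(text):
    """Parse an LSN printed as 'hi/lo' in hexadecimal into a comparable tuple."""
    hi, lo = text.strip().split("/")
    return (int(hi, 16), int(lo, 16))


def checkpoint_covers_unshipped_wal(old_master_checkpoint, standby_replayed):
    """Return True if the old master's latest checkpoint is past the point
    the standby replayed, i.e. a CHECKPOINT happened after WAL that was
    never shipped. Both arguments are LSN strings such as '0/3000020'.
    """
    return parse_lsn(old_master_checkpoint) > parse_lsn(standby_replayed)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    if checkpoint_covers_unshipped_wal("0/5000000", "0/4FFFFFF"):
        print("checkpoint is past the standby: take a new base backup")
    else:
        print("no checkpoint past the standby's replay location")
```

Comparing (hi, lo) pairs rather than a single integer keeps the ordering
right without assuming anything about how the two halves combine.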

>> How easy is it to script that? It seems all we need is the current XID
>> of the slave at the end of recovery. It should be in the log, maybe it's
>> easy enough to expose it at the SQL level…
>
> XID doesn't help at all, LSN more likely, but I feel that I don't fully
> understand what you're saying.

Sorry I was unclear, I was thinking in terms of the recovery.conf file
and either recovery_target_xid or recovery_target_time. The idea being
that if the old master didn't CHECKPOINT the changes that the slave
missed, then we can do crash recovery and choose to stop before that
point, then apply WALs from the new master.
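
As an illustration of the recovery.conf setup I mean, here is a small
sketch that writes out such a file. The helper name and its arguments
are invented for the example; restore_command, recovery_target_xid,
recovery_target_time and recovery_target_inclusive are the existing
recovery.conf parameters. Whether recovery on the old master can really
be made to stop this way in the crash case is exactly the open question,
so take it as a sketch of the idea only.

```python
def render_recovery_conf(restore_command, target_xid=None, target_time=None):
    """Build recovery.conf text that stops just before the given target.

    Exactly one of target_xid or target_time must be given.
    """
    if (target_xid is None) == (target_time is None):
        raise ValueError("give exactly one of target_xid or target_time")

    def quote(value):
        # Values are single-quoted; embedded quotes are doubled.
        return "'%s'" % str(value).replace("'", "''")

    lines = ["restore_command = %s" % quote(restore_command)]
    if target_xid is not None:
        lines.append("recovery_target_xid = %s" % quote(target_xid))
    else:
        lines.append("recovery_target_time = %s" % quote(target_time))
    # Stop before the target rather than just after it.
    lines.append("recovery_target_inclusive = 'false'")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical archive path and xid, for illustration only.
    print(render_recovery_conf("cp /path/to/archive/%f %p", target_xid=1234))
```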

That might sound like a strange thing to do, but if switching from
master to slave allows skipping the base backup to get a slave again, I
guess we'll see people choosing fully automated failover scripting
(with heartbeat and so on). The goal would be to reduce downtime as
much as you can.

When possible I'd still choose manual failover to the slave after a
master's restart and crash recovery, but the downtime constraint might
not allow that everywhere.

So you're saying controlled failover could possibly skip the base backup
and reuse the old master as the new slave, and I'm asking if by some
luck (the crash happened before CHECKPOINT) and some recovery.conf setup
we could get to the same situation in case of hard failure. That would
allow completely automatic switchover / failover with no need to resync.

I'm not sure how much clearer I managed to be :)

Regards,
--
dim
