Re: Patch for fail-back without fresh backup

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Amit Langote'" <amitlangote09(at)gmail(dot)com>
Cc: "'Samrat Revagade'" <revagade(dot)samrat(at)gmail(dot)com>, "'PostgreSQL-development'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch for fail-back without fresh backup
Date: 2013-06-26 04:40:24
Message-ID: 004a01ce7227$46a4a790$d3edf6b0$@kapila@huawei.com
Lists: pgsql-hackers

On Tuesday, June 25, 2013 10:23 AM Amit Langote wrote:
> Hi,
>
> >
> >> So our proposal on this problem is that we must ensure that master should
> >> not make any file system level changes without confirming that the
> >> corresponding WAL record is replicated to the standby.
> >
> > How will you take care of extra WAL on old master during recovery. If it
> > plays the WAL which has not reached new-master, it can be a problem.
> >
>
> I am trying to understand how there would be extra WAL on old master
> that it would replay and cause inconsistency. Consider how I am
> picturing it and correct me if I am wrong.
>
> 1) Master crashes. So a failback standby becomes new master forking the
> WAL.
> 2) Old master is restarted as a standby (now with this patch, without
> a new base backup).
> 3) It would try to replay all the WAL it has available and later
> connect to the new master also following the timeline switch (the
> switch might happen using archived WAL and timeline history file OR
> the new switch-over-streaming-replication-connection as of 9.3,
> right?)
>
> * in (3), when the new standby/old master is replaying WAL, from where
> is it picking the WAL?
Yes, this is the point that can lead to inconsistency. The new standby (old master)
will replay WAL from after its last successful checkpoint, whose location it gets
from the control file. It picks up that WAL from the location where it was logged
while the node was active (pg_xlog).
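
To make the concern concrete, here is a toy sketch in Python. It is not
PostgreSQL code: LSNs are modelled as plain integers, and the names Node,
replay_range and extra_wal are hypothetical, chosen only for illustration.
It assumes the old master replays everything in its local pg_xlog from its
last checkpoint onward, and that the new master forked the WAL at the last
position it had received.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy model of a server; LSNs are plain integers (an assumption)."""
    name: str
    checkpoint_lsn: int  # last checkpoint location, as read from the control file
    flushed_lsn: int     # end of the WAL present locally in pg_xlog


def replay_range(node):
    """WAL range a restarted node would replay from its local pg_xlog."""
    return (node.checkpoint_lsn, node.flushed_lsn)


def extra_wal(old_master, fork_lsn):
    """Part of the old master's replay range that lies beyond the point
    where the new master forked the WAL. Returns None if there is none."""
    if old_master.flushed_lsn <= fork_lsn:
        return None
    start = max(fork_lsn, old_master.checkpoint_lsn)
    return (start, old_master.flushed_lsn)


# Old master crashed with WAL up to 250; the standby had received only up
# to 200 when it was promoted, so the WAL forks at 200.
old_master = Node("old_master", checkpoint_lsn=100, flushed_lsn=250)
print(replay_range(old_master))
print(extra_wal(old_master, fork_lsn=200))
```

In this model the old master replays the range 100 to 250, of which 200 to
250 never reached the new master. Replaying that part is what can leave the
old master's data directory inconsistent with the new timeline.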

> Does it first replay all the WAL in pg_xlog
> before archive? Should we make it check for a timeline history file in
> archive before it starts replaying any WAL?

I have not really thought about what the best solution for this problem is.

> * And, would the new master, before forking the WAL, replay all the
> WAL that is necessary to come to state (of data directory) that the
> old master was just before it crashed?

I don't think the new master has any correlation with the old master's data directory.
Rather, it will replay the WAL it has received/flushed before it starts acting as master.

With Regards,
Amit Kapila.
