Re: BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Fujii Masao'" <masao(dot)fujii(at)gmail(dot)com>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby
Date: 2012-09-13 06:51:22
Message-ID: 004c01cd917c$3078d590$916a80b0$@kapila@huawei.com
Lists: pgsql-bugs

On Thursday, September 13, 2012 12:34 AM Fujii Masao wrote:
On Wed, Sep 12, 2012 at 8:47 PM, <amit(dot)kapila(at)huawei(dot)com> wrote:
>> The following bug has been logged on the website:
>
>> Bug reference: 7533
>> Logged by: Amit Kapila
>> Email address: amit(dot)kapila(at)huawei(dot)com
>> PostgreSQL version: 9.2.0
>> Operating system: Suse
>> Description:
>
>> M host is primary, S host is standby and CS host is cascaded standby.
>

> This procedure didn't reproduce the problem in HEAD. But when I restarted the master server between steps 11 and
> 12, I was able to reproduce the problem.

We also observed that it does not appear in 9.2rc1, due to commit b8b69d89905e04b910bcd65efce1791477b45d35 by Tom.
The reason is that, after that fix, checkpoint WAL arrives from the master, so the cascaded standby does not get stuck in the loop.
However, if we increase the checkpoint interval the problem does appear, although sometimes we need to try 4-5 times to reproduce it.

>> Observations related to bug
>> ------------------------------
>> In the above scenario it is observed that the startup process has read
>> all data up to position 5016220 (in our defect scenario minRecoveryPoint
>> is 5016220) and then checks for recovery consistency using the
>> following condition in function CheckRecoveryConsistency:
>> if (!reachedConsistency &&
>> XLByteLE(minRecoveryPoint, EndRecPtr) &&
>> XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
>>
>> At this point the first two conditions are true, but the last condition is
>> not, because redo has not yet been applied and hence
>> backupStartPoint has not been reset. So it does not signal the postmaster
>> that a consistent state has been reached.
>> After this it applies the redo, resets backupStartPoint, and then goes
>> on to read the next set of records. Since all records have already been
>> read, it starts waiting for a new record from the standby node. But since
>> no new record is coming from the standby node, it keeps waiting and never
>> gets a chance to recheck recovery consistency. Hence client connections
>> are not allowed.
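The ordering problem described in the quoted observations can be sketched as a small model. This is only an illustration under stated assumptions, not the PostgreSQL source: `Lsn`, `RecoveryModel`, `check_recovery_consistency` and `replay_available_records` are hypothetical names, LSNs are reduced to plain integers, and "invalid" is modelled as 0. The point it shows is that the check runs before each record is applied, so a reset of backupStartPoint performed while applying the last available record is never rechecked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* hypothetical stand-in for XLogRecPtr */
#define INVALID_LSN ((Lsn) 0)   /* assumption: 0 models an invalid pointer */

typedef struct
{
    Lsn  min_recovery_point;
    Lsn  backup_start_point;    /* INVALID_LSN once end-of-backup is replayed */
    Lsn  end_rec_ptr;           /* end of the most recently read record */
    bool reached_consistency;
} RecoveryModel;

/* Same three-part shape as the condition quoted from CheckRecoveryConsistency. */
static void
check_recovery_consistency(RecoveryModel *m)
{
    if (!m->reached_consistency &&
        m->min_recovery_point <= m->end_rec_ptr &&
        m->backup_start_point == INVALID_LSN)
        m->reached_consistency = true;
}

/*
 * Replay every record currently available. backup_end_index is the record
 * whose redo resets backup_start_point. Returns whether consistency was
 * signalled before the model would have to wait for more WAL.
 */
static bool
replay_available_records(RecoveryModel *m, const Lsn *record_ends,
                         size_t nrecords, size_t backup_end_index)
{
    assert(m != NULL && record_ends != NULL);

    for (size_t i = 0; i < nrecords; i++)
    {
        m->end_rec_ptr = record_ends[i];
        check_recovery_consistency(m);          /* check precedes redo */
        if (i == backup_end_index)
            m->backup_start_point = INVALID_LSN; /* redo resets it */
    }
    /* A real startup process would now block waiting for new WAL. */
    return m->reached_consistency;
}
```

In this model, if the record that resets backup_start_point is the last one available and ends at minRecoveryPoint (5016220 in the report), the function returns false even though all three conditions now hold; one further record of any kind triggers another check and it returns true.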

> If cascaded standby starts a recovery at a normal checkpoint record, this problem will not happen, because if
> wal_level is set to hot_standby, an XLOG_RUNNING_XACTS WAL record always follows the normal checkpoint record.
> So while the XLOG_RUNNING_XACTS record is being replayed, ControlFile->backupStartPoint can be reset, and then
> cascaded standby can pass through the consistency test.

> The problem happens when cascaded standby starts a recovery at a shutdown checkpoint record. In this case, no WAL
> record might follow the checkpoint one yet. So, after replaying the shutdown checkpoint record, cascaded standby needs
> to wait for new WAL record to appear before reaching the code block for resetting ControlFile->backupStartPoint.
> The cascaded standby cannot reach a consistent state and a client cannot connect to the cascaded standby until new WAL
> has arrived.

In the above scenario we are not doing a shutdown, so how can a shutdown checkpoint record appear?

Also, for the normal checkpoint case I have done a brief analysis:
I have observed in the code that ControlFile->minRecoveryPoint is updated while replaying the XLOG_BACKUP_END WAL record.
On hot standby S, this means that ControlFile->minRecoveryPoint will point to an LSN after the checkpoint record.

Now, when recovery starts on cascaded standby CS after a base backup from hot standby S, minRecoveryPoint should point to an LSN after the checkpoint record's LSN, so it might create a problem.

> Attached patch will fix the problem. In this patch, if recovery is beginning at a shutdown checkpoint record, any
> ControlFile fields (like backupStartPoint) required for checking that an end-of-backup is reached are not set at first.
> IOW, cascaded standby thinks that the database is consistent from the beginning. This is safe because a shutdown
> checkpoint record means that there is no running database activity at that point and the database is in consistent
> state.
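As a reading aid, the idea described in the quoted patch summary might look roughly like the sketch below. It is based only on that description, not on the attached patch itself, and every name in it (`Lsn`, `CheckpointKind`, `ControlModel`, `init_backup_tracking`, `waiting_for_end_of_backup`) is hypothetical; "invalid" is again modelled as 0.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* hypothetical stand-in for XLogRecPtr */
#define INVALID_LSN ((Lsn) 0)   /* assumption: 0 models an invalid pointer */

typedef enum { CKPT_ONLINE, CKPT_SHUTDOWN } CheckpointKind;

typedef struct
{
    Lsn backup_start_point;     /* valid while end-of-backup is still pending */
} ControlModel;

/*
 * Decide, at the start of recovery from a backup, whether end-of-backup
 * tracking is needed. For a shutdown checkpoint the field is left unset,
 * so the consistency test is not blocked on it.
 */
static void
init_backup_tracking(ControlModel *c, CheckpointKind kind, Lsn redo_lsn)
{
    if (kind == CKPT_SHUTDOWN)
        c->backup_start_point = INVALID_LSN;  /* treated as consistent already */
    else
        c->backup_start_point = redo_lsn;     /* must later see end-of-backup */
}

/* True while recovery still has to replay an end-of-backup point. */
static bool
waiting_for_end_of_backup(const ControlModel *c)
{
    return c->backup_start_point != INVALID_LSN;
}
```

Under this model a recovery that begins at a shutdown checkpoint never waits for a later record to clear backup_start_point, which is the behaviour the quoted description argues is safe.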

I shall test with the patch.

With Regards,
Amit Kapila.
