Re: BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: amit(dot)kapila(at)huawei(dot)com, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby
Date: 2012-09-13 12:21:38
Message-ID: 5051CFD2.60103@iki.fi
Lists: pgsql-bugs

On 12.09.2012 22:03, Fujii Masao wrote:
> On Wed, Sep 12, 2012 at 8:47 PM,<amit(dot)kapila(at)huawei(dot)com> wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference: 7533
>> Logged by: Amit Kapila
>> Email address: amit(dot)kapila(at)huawei(dot)com
>> PostgreSQL version: 9.2.0
>> Operating system: Suse
>> Description:
>>
>> M host is primary, S host is standby and CS host is cascaded standby.
>>
>> 1. Set up postgresql-9.2beta2/RC1 on all hosts.
>> 2. Execute initdb on host M to create a fresh database.
>> 3. Modify the configuration file postgresql.conf on host M like this:
>> listen_addresses = 'M'
>> port = 15210
>> wal_level = hot_standby
>> max_wal_senders = 4
>> hot_standby = on
>> 4. Modify the configuration file pg_hba.conf on host M like this:
>> host replication repl M/24 md5
>> 5. Start the server on host M as primary.
>> 6. Connect a client to the primary server and create a user 'repl':
>> Create user repl superuser password '123';
>> 7. Use pg_basebackup on host S to retrieve the database of the primary
>> host:
>> pg_basebackup -D /opt/t38917/data -F p -x fetch -c fast -l repl_backup -P
>> -v -h M -p 15210 -U repl -W
>> 8. Copy recovery.conf.sample from the share folder of the package to the
>> database folder of host S, then rename it to recovery.conf.
>> 9. Modify recovery.conf on host S as below:
>> standby_mode = on
>> primary_conninfo = 'host=M port=15210 user=repl password=123'
>> 10. Modify postgresql.conf on host S as follows:
>> listen_addresses = 'S'
>> 11. Start the server on host S as a standby server.
>> 12. Use pg_basebackup on host CS to retrieve the database of the standby
>> host:
>> pg_basebackup -D /opt/t38917/data -F p -x fetch -c fast -l repl_backup -P
>> -v -h S -p 15210 -U repl -W
>> 13. Modify recovery.conf on host CS as below:
>> standby_mode = on
>> primary_conninfo = 'host=S port=15210 user=repl password=123'
>> 14. Modify postgresql.conf on host CS as follows:
>> listen_addresses = 'CS'
>> 15. Start the server on host CS as a cascaded standby node.
>> 16. Try to connect a client to host CS; it fails with:
>> FATAL: the database system is starting up
>
> This procedure didn't reproduce the problem on HEAD. But when I restarted
> the master server between steps 11 and 12, I was able to reproduce the
> problem.
>
>> Observations related to bug
>> ------------------------------
>> In the above scenario, the startup process has read all WAL (in our
>> scenario minRecoveryPoint is 5016220) up to position 5016220, and then
>> checks for recovery consistency with the following condition in
>> CheckRecoveryConsistency:
>> if (!reachedConsistency &&
>>     XLByteLE(minRecoveryPoint, EndRecPtr) &&
>>     XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
>>
>> At this point the first two conditions are true, but the last one is not,
>> because the redo has not yet been applied and backupStartPoint has
>> therefore not been reset. So the startup process does not signal the
>> postmaster about the consistent state. It then applies the redo, resets
>> backupStartPoint, and goes to read the next record. Since all records
>> have already been read, it starts waiting for a new record from the
>> standby node. But no new record arrives, so it keeps waiting and never
>> gets a chance to recheck the recovery consistency level; hence client
>> connections are not allowed.
>
> If the cascaded standby starts recovery at a normal checkpoint record,
> this problem does not happen, because when wal_level is set to
> hot_standby, an XLOG_RUNNING_XACTS WAL record always follows a normal
> checkpoint record. So while the XLOG_RUNNING_XACTS record is being
> replayed, ControlFile->backupStartPoint can be reset, and the cascaded
> standby can pass the consistency test.
>
> The problem happens when the cascaded standby starts recovery at a
> shutdown checkpoint record. In this case, no WAL record may follow
> the checkpoint yet. So, after replaying the shutdown checkpoint
> record, the cascaded standby has to wait for a new WAL record to appear
> before it reaches the code block that resets ControlFile->backupStartPoint.
> The cascaded standby cannot reach a consistent state, and a client cannot
> connect to it until new WAL has arrived.
>
> The attached patch fixes the problem. With the patch, if recovery
> begins at a shutdown checkpoint record, the ControlFile fields
> (like backupStartPoint) used to detect the end of a backup are not
> set at first. IOW, the cascaded standby considers the database
> consistent from the beginning. This is safe because a shutdown
> checkpoint record means there was no running database activity at
> that point, so the database is in a consistent state.

Hmm, I think the CheckRecoveryConsistency() call in the redo loop is
misplaced. It's called after we got a record from ReadRecord, but
*before* replaying it (rm_redo). Even if replaying record X makes the
system consistent, we won't check and notice that until we have fetched
record X+1. In this particular test case, record X is a shutdown
checkpoint record, but it could just as well be a running-xacts record, or
the record that reaches minRecoveryPoint.

Does the problem go away if you just move the CheckRecoveryConsistency()
call *after* rm_redo (attached)?

- Heikki

Attachment Content-Type Size
move-check-consistency.patch text/x-diff 765 bytes
