Re: BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: <hlinnaka(at)iki(dot)fi>, "'Fujii Masao'" <masao(dot)fujii(at)gmail(dot)com>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby
Date: 2012-09-13 13:18:47
Message-ID: 007201cd91b2$4f1e89a0$ed5b9ce0$@kapila@huawei.com
Lists: pgsql-bugs

On Thursday, September 13, 2012 5:52 PM Heikki Linnakangas wrote:
On 12.09.2012 22:03, Fujii Masao wrote:
> On Wed, Sep 12, 2012 at 8:47 PM,<amit(dot)kapila(at)huawei(dot)com> wrote:
>> The following bug has been logged on the website:
>>
>>> Bug reference: 7533
>>> Logged by: Amit Kapila
>>> Email address: amit(dot)kapila(at)huawei(dot)com
>>> PostgreSQL version: 9.2.0
>>> Operating system: Suse
>>> Description:
>>
>>> M host is primary, S host is standby and CS host is cascaded standby.
>>

> Hmm, I think the CheckRecoveryConsistency() call in the redo loop is misplaced. It's called after we got a record from
> ReadRecord, but *before* replaying it (rm_redo). Even if replaying record X makes the system consistent, we won't check
> and notice that until we have fetched record X+1. In this particular test case, record X is a shutdown checkpoint
> record, but it could as well be a running-xacts record, or the record that reaches minRecoveryPoint.
>
> Does the problem go away if you just move the CheckRecoveryConsistency() call *after* rm_redo (attached)?

This will resolve the problem I reported, but moving the call down might create another problem. Because of
recoveryPausesHere(), recovery might get paused. In the current code a client might still be able to connect in that state, because CheckRecoveryConsistency() is called before the pause. After the suggested change, it might happen that the client will not be able to connect.

If you agree that the problem I describe is real, can we consider calling CheckRecoveryConsistency() both at its current place and at the place you have proposed? If my description doesn't make sense (it is only my suspicion), then we can simply move the call down and fix the defect.
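
To make the placement question concrete, here is a toy C model of the redo loop. It is only a sketch and not PostgreSQL source: ToyRecord, toy_first_open() and the CHECK_* flags are hypothetical names, and the pause logic is deliberately left out. It reports the index of the fetched record at which a consistency check would first succeed, for a check placed before redo, after redo, or in both places.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name below is hypothetical, not PostgreSQL code. */
typedef struct {
    bool makes_consistent;   /* replaying this record reaches consistency */
} ToyRecord;

/* Where the consistency check sits relative to the redo step. */
enum { CHECK_BEFORE_REDO = 1, CHECK_AFTER_REDO = 2 };

/*
 * Walk the records in order. Return the index of the record being
 * processed when a check first sees a consistent state, or -1 if no
 * check ever sees one before the records run out.
 */
int toy_first_open(const ToyRecord *recs, size_t n, int placement)
{
    bool consistent = false;

    assert(recs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* Record i has been fetched but not yet replayed. */
        if ((placement & CHECK_BEFORE_REDO) && consistent)
            return (int) i;

        /* Stand-in for the redo step of record i. */
        if (recs[i].makes_consistent)
            consistent = true;

        /* Record i has now been replayed. */
        if ((placement & CHECK_AFTER_REDO) && consistent)
            return (int) i;
    }
    return -1;
}
```

In this model, if record X (index 1) is the one that makes the system consistent, a check placed only before redo fires when record X+1 is fetched, or not at all if X is the last record available. A check placed after redo, or in both places, fires at X itself.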

Thank you for such a quick response about the defect.

With Regards,
Amit Kapila.
