From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila(at)huawei(dot)com> |
Cc: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [BUG?] lag of minRecoveryPont in archive recovery |
Date: | 2012-12-09 15:36:31 |
Message-ID: | CAHGQGwG4W5QZ7+LJimg8xxuevwz0bYniHmZLZmWf0j6kBiuRCg@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Dec 6, 2012 at 8:39 PM, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
> On Thursday, December 06, 2012 9:35 AM Kyotaro HORIGUCHI wrote:
>> Hello, I have a problem with PostgreSQL 9.2 with Pacemaker.
>>
>> The HA standby sometimes fails to start under normal operation.
>>
>> Testing with a bare replication pair showed that the standby fails
>> startup recovery under the operation sequence shown below. 9.3dev is
>> affected too, but 9.1 does not show this problem. The problem was
>> exposed by the invalid-page check of xlog, but 9.1 potentially has
>> the same glitch.
>>
>> After investigation, the lag of minRecoveryPoint behind EndRecPtr in
>> the redo loop seems to be the cause. The lag leads to repeated redo
>> of unrepeatable xlog sequences such as XLOG_HEAP2_VISIBLE ->
>> SMGR_TRUNCATE on the same page. So I applied the same remedy to
>> smgr_redo as in xact_redo_commit_internal. While doing this, I
>> noticed that CheckRecoveryConsistency() in the redo apply loop should
>> be called after redoing the record, so I moved it.
>
> I think moving CheckRecoveryConsistency() to after the redo apply step
> might cause a problem. Currently it is called before
> recoveryStopsHere(), which can allow connections on a hot standby.
> If recovery pauses or stops for some reason in that function,
> connections might not be allowed, because CheckRecoveryConsistency()
> would not be called.

Yes. So shouldn't we just add a CheckRecoveryConsistency() call after
rm_redo, rather than moving the existing one? This issue is related to
the old discussion:
http://archives.postgresql.org/pgsql-bugs/2012-09/msg00101.php
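
To make the ordering question concrete, here is a toy sketch. It is not
the real xlog.c code: every type and function name in it (ToyRecovery,
toy_redo_loop, and so on) is a hypothetical stand-in, and it models only
where the consistency check sits relative to the stop check and the redo
call, not how minRecoveryPoint itself is updated.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of this is the real xlog.c code. */
typedef unsigned long ToyLsn;

typedef struct {
    ToyLsn end_rec_ptr;         /* end of the last record replayed */
    ToyLsn min_recovery_point;  /* consistent once replay reaches this */
    bool   reached_consistency; /* would let hot standby accept connections */
} ToyRecovery;

/* Where the consistency check is placed in the loop body. */
typedef enum { CHECK_BEFORE_ONLY, CHECK_AFTER_ONLY, CHECK_BOTH } ToyCheckMode;

/* Models CheckRecoveryConsistency(): latch the flag once far enough. */
static void toy_check_consistency(ToyRecovery *r)
{
    if (!r->reached_consistency && r->end_rec_ptr >= r->min_recovery_point)
        r->reached_consistency = true;
}

/* Models recoveryStopsHere(): stop before a record ending past the target. */
static bool toy_recovery_stops_here(ToyLsn record_end, ToyLsn stop_at)
{
    return record_end > stop_at;
}

/* Replays records in order; returns how many were applied. */
static size_t toy_redo_loop(ToyRecovery *r, const ToyLsn *record_ends,
                            size_t n, ToyLsn stop_at, ToyCheckMode mode)
{
    size_t applied = 0;

    for (size_t i = 0; i < n; i++) {
        if (mode != CHECK_AFTER_ONLY)
            toy_check_consistency(r);       /* existing call site */

        if (toy_recovery_stops_here(record_ends[i], stop_at))
            break;

        r->end_rec_ptr = record_ends[i];    /* stands in for rm_redo() */
        applied++;

        if (mode != CHECK_BEFORE_ONLY)
            toy_check_consistency(r);       /* additional call site */
    }
    return applied;
}
```

In this toy model, CHECK_BEFORE_ONLY misses consistency when the last
available record is the one that reaches min_recovery_point, because no
further iteration runs the check. CHECK_AFTER_ONLY misses it when replay
is already consistent but the stop check breaks out on the first record.
CHECK_BOTH covers both cases, which is the "add rather than move" idea.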
Regards,
--
Fujii Masao