Re: BUG #8686: Standby could not restart.

From: Tomonari Katsumata <t(dot)katsumata1122(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Tomonari Katsumata <katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #8686: Standby could not restart.
Date: 2013-12-23 06:15:17
Message-ID: CAC55fYfGp3zfp1ySJc_QCJycQt4iJ0ch2S77S4gWQLj-x_pp1g@mail.gmail.com
Lists: pgsql-bugs

Hi Heikki,
Thanks for your confirmation and comments.

>
>
>> /*
>>  * Initialize shared replayEndRecPtr, lastReplayedEndRecPtr, and
>>  * recoveryLastXTime.
>>  *
>>  * This is slightly confusing if we're starting from an online
>>  * checkpoint; we've just read and replayed the checkpoint record, but
>>  * we're going to start replay from its redo pointer, which precedes
>>  * the location of the checkpoint record itself. So even though the
>>  * last record we've replayed is indeed ReadRecPtr, we haven't
>>  * replayed all the preceding records yet. That's OK for the current
>>  * use of these variables.
>>  */
>> SpinLockAcquire(&xlogctl->info_lck);
>> xlogctl->replayEndRecPtr = ReadRecPtr;
>> xlogctl->lastReplayedEndRecPtr = EndRecPtr;
>> xlogctl->recoveryLastXTime = 0;
>> xlogctl->currentChunkStartTime = 0;
>> xlogctl->recoveryPause = false;
>> SpinLockRelease(&xlogctl->info_lck);
>>
>
> I think we need to fix that confusion. Your patch will do it by not
> setting EndRecPtr yet; that fixes the bug, but leaves those variables in a
> slightly strange state; I'm not sure what EndRecPtr points to in that case
> (0 ?), but ReadRecPtr would be set I guess.
>
Yes, the values were set as follows:
ReadRecPtr:1/8E7F0B0
EndRecPtr:0/0

>
> Perhaps we should reset replayEndRecPtr and lastReplayedEndRecPtr to the
> REDO point here, instead of ReadRecPtr/EndRecPtr.
>

I made another patch.
I added a ReadRecord call to check whether the record at the REDO location
is present. A similar check is done when backup_label is used.

Because that ReadRecord call returns a record that has then already been read,
I set EndRecPtr to the ReadRecPtr of that record.
I also set ReadRecPtr to record->xl_prev.
As you said, it also worked fine.

I'm not sure whether we should do the same thing during crash recovery, so
for now I added this step only for the case where archive recovery is needed.

Please see the attached patch.

regards,
---------------------
Tomonari Katsumata

Attachment: making_sure_ckptredo_from_pg_control.patch (application/octet-stream, 973 bytes)