Re: Fast promotion failure

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fast promotion failure
Date: 2013-05-20 21:00:59
Message-ID: CA+U5nM+Lp8ryxRz-qen6Jr7v+2N40ViarvLo6mMzQ16hYoTzcQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 20 May 2013 20:40, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
> On 20.05.2013 22:18, Simon Riggs wrote:
>>
>> On 20 May 2013 18:47, Heikki Linnakangas<hlinnakangas(at)vmware(dot)com> wrote:
>>>
>>> Not sure what the best fix would be. Perhaps change the code in
>>>
>>> CreateRestartPoint() to do something like this instead:
>>>
>>> GetXLogReplayRecPtr(&replayTLI);
>>> if (RecoveryInProgress())
>>> ThisTimeLineID = replayTLI;
>>
>>
>> Installing a few extra WAL files doesn't seem to be a good reason to
>> mess with such an important variable as ThisTimeLineID, especially
>> since its such a rare event and hardly worth optimising for.
>>
>> I would prefer it if we didn't set ThisTimeLineID at all in that way,
>> or anywhere else. If we do, it needs to be done safely, otherwise any
>> caller could make the same mistake.
>
>
>> Suggested patch attached.
>
>
>> @@ -7466,14 +7468,6 @@ CreateRestartPoint(int flags)
>>
>> * not be used, and will go wasted until recycled on the
>> next
>> * restartpoint. We'll live with that.
>> */
>> - (void) GetXLogReplayRecPtr(&ThisTimeLineID);
>> -
>> - RemoveOldXlogFiles(_logSegNo, endptr);
>> -
>> - /*
>> - * Make more log segments if needed. (Do this after
>> recycling old log
>> - * segments, since that may supply some of the needed
>> files.)
>> - */
>> PreallocXlogFiles(endptr);
>> }
>
>
> That's essentially reverting commit 343ee00b7, resurrecting the bug that
> that fixed.

Hmm, OK.

Then we should using the correct value, not the one that patch set. It
certainly worked at that time, but not in a principled way.

When we set the new timeline we should be updating all values that
might be used elsewhere. If we do that, then no matter when or how we
run GetXLogReplayRecPtr, it can't ever get it wrong in any backend.

Patch attached.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
set_lastReplayedTLI.v1.patch application/octet-stream 799 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2013-05-20 21:20:10 Re: Better LWLocks with compare-and-swap (9.4)
Previous Message Bruce Momjian 2013-05-20 20:41:07 Re: Better LWLocks with compare-and-swap (9.4)