Re: PITR promote bug: Checkpointer writes to older timeline

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, soumyadeep2007(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org, kevin(dot)yeap(at)vmware(dot)com, michael(at)paquier(dot)xyz, jyih(at)vmware(dot)com
Subject: Re: PITR promote bug: Checkpointer writes to older timeline
Date: 2021-03-03 09:04:05
Message-ID: 203f64f0-7fdd-2dae-adab-342a6763aa8f@oss.nttdata.com
Lists: pgsql-hackers

On 2021/03/03 17:46, Heikki Linnakangas wrote:
> On 03/03/2021 08:47, Kyotaro Horiguchi wrote:
>> At Tue, 2 Mar 2021 17:56:03 -0800, Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com> wrote in
>>> When there are prepared transactions in an older timeline, in the
>>> checkpointer, a call to CheckPointTwoPhase() and subsequently to
>>> XlogReadTwoPhaseData() and subsequently to read_local_xlog_page() leads
>>> to the following line:
>>>
>>> read_upto = GetXLogReplayRecPtr(&ThisTimeLineID);
>>>
>>> GetXLogReplayRecPtr() will change ThisTimeLineID to 1, in order to read
>>> the two phase WAL records in the older timeline. This variable will
>>> remain unchanged and the checkpointer ends up writing the checkpoint
>>> record into the older WAL segment (when XLogBeginInsert() is called
>>> within CreateCheckPoint(), the value is still 1). The value is not
>>> synchronized as even if RecoveryInProgress() is called,
>>> xlogctl->SharedRecoveryState is not RECOVERY_STATE_DONE
>>> (SharedRecoveryInProgress = true in older versions) as the startup
>>> process waits for the checkpointer inside RequestCheckpoint() (since
>>> recovery_target_action='promote' involves a non-fast promotion). Thus,
>>> InitXLOGAccess() is not called and the value of ThisTimeLineID is not
>>> updated before the checkpoint record write.
>>>
>>> Since 1148e22a82e, GetXLogReplayRecPtr() is called with ThisTimeLineID
>>> instead of a local variable, within read_local_xlog_page().
>
> Confusing...
>
>>> PFA a small patch that fixes the problem by explicitly calling
>>> InitXLOGAccess() in CheckPointTwoPhase(), after the two phase state data
>>> is read, in order to update ThisTimeLineID to the latest timeline. It is
>>> okay to call InitXLOGAccess() as it is lightweight and would mostly be
>>> a no-op.
>>
>> It is correct that read_local_xlog_page() changes ThisTimeLineID, but
>> InitXLOGAccess() is correctly called in CreateCheckPoint:
>>
>> |    /*
>> |     * An end-of-recovery checkpoint is created before anyone is allowed to
>> |     * write WAL. To allow us to write the checkpoint record, temporarily
>> |     * enable XLogInsertAllowed.  (This also ensures ThisTimeLineID is
>> |     * initialized, which we need here and in AdvanceXLInsertBuffer.)
>> |     */
>> |    if (flags & CHECKPOINT_END_OF_RECOVERY)
>> |        LocalSetXLogInsertAllowed();
>>
>> It seems to be sufficient to recover ThisTimeLineID from the checkpoint
>> record to be written, as attached?
>
> I think it should be reset even earlier, inside XlogReadTwoPhaseData() probably. With your patch, doesn't the LogStandbySnapshot() call just above where you're resetting ThisTimeLineID also write a WAL record with incorrect timeline?
>
> Even better, can we avoid setting ThisTimeLineID in XlogReadTwoPhaseData() in the first place?

Or isn't it better to reset ThisTimeLineID in read_local_xlog_page(), i.e.,
prevent read_local_xlog_page() from changing ThisTimeLineID? I'm not
sure if that's possible, though. In the future, other functions that call
read_local_xlog_page() during promotion may appear. If we fix the issue
only outside read_local_xlog_page(), those functions could run into
the same issue.
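
To make the idea concrete, here is a minimal standalone sketch. It is a toy
model, not PostgreSQL code: stub_get_replay_ptr(), read_page_clobbering(),
read_page_preserving() and timeline_after_read() are hypothetical stand-ins,
and the timeline values 1 and 2 are assumed only for illustration. It shows
that passing a local variable as the out-parameter leaves the global
ThisTimeLineID untouched, whereas passing the global itself overwrites it.

```c
/* Toy model only: the functions below are stand-ins, not PostgreSQL's. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TimeLineID;
typedef uint64_t XLogRecPtr;

/* Process-local global; assume the promoted (new) timeline is 2. */
static TimeLineID ThisTimeLineID = 2;

/*
 * Stand-in for GetXLogReplayRecPtr(): returns a replay position and
 * reports the replay timeline (assumed to be 1) through the out-parameter.
 */
static XLogRecPtr
stub_get_replay_ptr(TimeLineID *replayTLI)
{
    if (replayTLI != NULL)
        *replayTLI = 1;
    return (XLogRecPtr) 0x1000;
}

/* Problematic shape: the global is passed, so it gets overwritten. */
static XLogRecPtr
read_page_clobbering(void)
{
    return stub_get_replay_ptr(&ThisTimeLineID);
}

/* Suggested shape: a local receives the timeline; the global is untouched. */
static XLogRecPtr
read_page_preserving(TimeLineID *pageTLI)
{
    TimeLineID  replayTLI;
    XLogRecPtr  read_upto = stub_get_replay_ptr(&replayTLI);

    if (pageTLI != NULL)
        *pageTLI = replayTLI;   /* caller can still learn the page's timeline */
    return read_upto;
}

/*
 * Resets the global to the new timeline, performs one page read, and
 * returns the timeline a later WAL insert in this process would use.
 */
static TimeLineID
timeline_after_read(int preserve)
{
    ThisTimeLineID = 2;
    if (preserve)
        (void) read_page_preserving(NULL);
    else
        (void) read_page_clobbering();
    return ThisTimeLineID;
}
```

In this model, timeline_after_read(0) leaves the global on the old timeline,
while timeline_after_read(1) keeps the new one, so any caller of the
preserving variant is protected without needing its own fix-up.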

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
