Re: PITR promote bug: Checkpointer writes to older timeline

From: Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, jyih(at)vmware(dot)com, kyeap(at)vmware(dot)com
Subject: Re: PITR promote bug: Checkpointer writes to older timeline
Date: 2021-03-03 22:56:25
Message-ID: CAE-ML+_7Fgz9rp-hX0HJL+Y+w0irXfXvKA04AcBcNDxPCT3=6w@mail.gmail.com
Lists: pgsql-hackers

On 2021/03/03 17:46, Heikki Linnakangas wrote:

> I think it should be reset even earlier, inside XlogReadTwoPhaseData()
> probably. With your patch, doesn't the LogStandbySnapshot() call just
> above where you're resetting ThisTimeLineID also write a WAL record
> with incorrect timeline?

Agreed.

On Wed, Mar 3, 2021 at 1:04 AM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:

> > Even better, can we avoid setting ThisTimeLineID in XlogReadTwoPhaseData() in the first place?
>
> Or isn't it better to reset ThisTimeLineID in read_local_xlog_page(), i.e.,
> prevent read_local_xlog_page() from changing ThisTimeLineID? I'm not
> sure if that's possible, though. In the future other functions that call
> read_local_xlog_page() during the promotion may appear. Fixing the issue
> outside read_local_xlog_page() may cause those functions to get
> the same issue.

I agree. We should fix the issue in read_local_xlog_page(). I have
attached two different patches which do so:
saved_ThisTimeLineID.patch and pass_ThisTimeLineID.patch.

The former saves the value of ThisTimeLineID before read_local_xlog_page()
changes it, and restores the saved value once XLogReadDetermineTimeline(),
which reads ThisTimeLineID further down in the code, has run.
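
As a rough illustration of the save-and-restore idea (this is not the
attached patch): the sketch below uses a plain global as a stand-in for
ThisTimeLineID, and the two mock_* functions and tli_seen_by_lookup are
hypothetical names made up for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Stand-in for the backend-global ThisTimeLineID (assumption: a plain global). */
static TimeLineID ThisTimeLineID;

/* Records which timeline the hypothetical lookup observed. */
static TimeLineID tli_seen_by_lookup;

/* Hypothetical stand-in for a lookup that reads the global timeline. */
static void
mock_determine_timeline_saved(void)
{
    tli_seen_by_lookup = ThisTimeLineID;
}

/*
 * Hypothetical page-read callback: it overwrites the global with the
 * timeline being read, lets the lookup use it, then puts the caller's
 * value back so later WAL writes do not see the changed timeline.
 */
static void
mock_read_local_page_saved(TimeLineID read_tli)
{
    TimeLineID saved_tli = ThisTimeLineID;  /* remember the caller's value */

    assert(read_tli != 0);                  /* timeline IDs start at 1 */

    ThisTimeLineID = read_tli;              /* temporary change */
    mock_determine_timeline_saved();        /* consumer of the changed value */
    ThisTimeLineID = saved_tli;             /* restore before returning */
}
```

With the global set to 2 beforehand, calling mock_read_local_page_saved(1)
lets the lookup see timeline 1 while the caller still sees 2 afterwards.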

The latter removes occurrences of ThisTimeLineID from
XLogReadDetermineTimeline() and introduces an argument currTLI to
XLogReadDetermineTimeline() to be used in its stead.
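
A minimal sketch of the pass-as-argument idea, again not the attached
patch: only the parameter name currTLI comes from the description above;
the mock_* functions and the plain global standing in for ThisTimeLineID
are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Stand-in for the backend-global ThisTimeLineID (assumption: a plain global). */
static TimeLineID ThisTimeLineID;

/*
 * Hypothetical lookup that takes the current timeline as an explicit
 * argument instead of reading the global. It simply echoes the value
 * it was given so the caller can check what it saw.
 */
static TimeLineID
mock_determine_timeline_passed(TimeLineID currTLI)
{
    assert(currTLI != 0);   /* timeline IDs start at 1 */
    return currTLI;
}

/*
 * Hypothetical page-read callback: the timeline being read lives in a
 * local variable and is handed down, so the global is never touched.
 */
static TimeLineID
mock_read_local_page_passed(TimeLineID read_tli)
{
    TimeLineID currTLI = read_tli;

    return mock_determine_timeline_passed(currTLI);
}
```

Here the global never changes, so there is nothing to restore, at the
cost of a wider function signature.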

Regards,
Soumyadeep

Attachment                  Content-Type  Size
pass_ThisTimeLineID.patch   text/x-patch  4.8 KB
saved_ThisTimeLineID.patch  text/x-patch  853 bytes
