Re: Race condition in recovery?

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: dilipbalaut(at)gmail(dot)com
Cc: robertmhaas(at)gmail(dot)com, hlinnaka(at)iki(dot)fi, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Race condition in recovery?
Date: 2021-05-17 03:20:12
Message-ID: 20210517.122012.1625399639733083060.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Sat, 15 May 2021 10:55:05 +0530, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote in
> On Sat, May 15, 2021 at 3:58 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > I did notice, but keep in mind that this was more than 8 years ago.
> > Even if Heikki is reading this thread, he may not remember why he
> > changed 1 line of code one way rather than another in 2013. I mean if
> > he does that's great, but it's asking a lot.
>
> I agree with your point. But I think that one line is related to the
> purpose of this commit and I have explained (in 3rd paragraph below)
> why do I think so.
>
> As I understand it, the general issue here was that if
> > XLogFileReadAnyTLI() was called before expectedTLEs got set, then
> > prior to this commit it would have to fail, because the foreach() loop
> > in that function would be iterating over an empty list. Heikki tried
> > to make it not fail in that case, by setting tles =
> > readTimeLineHistory(recoveryTargetTLI), so that the foreach loop
> > *wouldn't* get an empty list.
>
> I might be missing something but I don't agree with this logic. If
> you see prior to this commit the code flow was like below[1]. So my
> point is if we are calling XLogFileReadAnyTLI() somewhere while
> reading the checkpoint record then note that expectedTLEs were
> initialized unconditionally before even try to read that checkpoint
> record. So how expectedTLEs could be uninitialized in
> LogFileReadAnyTLI?

Mmm. I think both of you are right. Before the commit,
XLogFileReadAnyTLI expected that expectedTLEs is initialized. After
the commit it cannot no longer expect that so readTimeLineHistory was
changed to try fetching by itself. *If* an appropriate history file
is found, it *initializes* expectedTLEs with the content.

> [1]
> StartupXLOG(void)
> {
> ....
>
> recoveryTargetTLI = ControlFile->checkPointCopy.ThisTimeLineID;
> ...
> readRecoveryCommandFile();
> ...
> expectedTLEs = readTimeLineHistory(recoveryTargetTLI);
> ...
> ..
> record = ReadCheckpointRecord(checkPointLoc, 0);
>
>
> Another point which I am not sure about but still I think that one
> line (expectedTLEs = readTimeLineHistory(receiveTLI);), somewhere
> related to the purpose of this commit. Let me explain why do I think
> so. Basically, before this commit, we were initializing
> "expectedTLEs" based on the history file of "recoveryTargetTLI", right
> after calling "readRecoveryCommandFile()" (this function will
> initialize recoveryTargetTLI based on the recovery target, and it
> ensures it read the respective history file). Now, right after this
> point, there was a check as shown below[2], which is checking whether
> the checkpoint TLI exists in the "expectedTLEs" which is initialized
> based on recoveryTargetTLI. And it appeared that this check was
> failing in some cases which this commit tried to fix and all other
> code is there to support that. Because now before going for reading
> the checkpoint we are not initializing "expectedTLEs" so now after
> moving this line from here it was possible that "expectedTLEs" is not
> initialized in XLogFileReadAnyTLI() and the remaining code in
> XLogFileReadAnyTLI() is to handle that part.

Before the commit expectedTLEs is always initialized with just one
entry for the TLI of the last checkpoint record.

(1) If XLogFileReadAnyTLI() found the segment but no history file
found, that is, using the dummy TLE-list, expectedTLEs is initialized
with the dummy one-entry list. So there's no behavioral change in this
aspect.

(2) If we didn't find the segment for the checkpoint record, it starts
replication and fetches history files and WAL records then revisits
XLogFileReadAnyTLI. Now we have both the history file and segments,
it successfully reads the recood. The difference of expectedTLEs made
by the patch is having just one entry or the all entries for the past.

Assuming that we keep expectedTLEs synced with recoveryTargetTLI,
rescanLatestTimeLine updates the list properly so no need to worry
about the future. So the issue would be in the past timelines. After
reading the checkpoint record, if we need to rewind to the previous
timeline for the REDO point, the dummy list is inconvenient.

So there is a possibility that the patch fixed the case (2), where the
standby doesn't have both the segment for the checkpoint record and
the history file for the checkpoint, and the REDO point is on the last
TLI. If it is correct, the patch still fails for the case (1), that
is, the issue raised here. Anyway it would be useless (and rahter
harmful) to initialize expectedTLEs based on receiveTLI there.

So my resul there is:

The commit fixed the case (2)
The fix caused the issue for the case (1).
The proposed fix fixes the case (1), caused by the commit.

There's another issue in the case (1) and REDO point is back to the
previous timeline, which is in doubt we need to fix..

> Now, coming to my point that why this one line is related, In this
> one line (expectedTLEs = readTimeLineHistory(receiveTLI);), we
> completely avoiding recoveryTargetTLI and initializing "expectedTLEs"
> based on the history file of the TL from which we read the checkpoint,
> so now, there is no scope of below[2] check to fail because note that
> we are not initializing "expectedTLEs" based on the
> "recoveryTargetTLI" but we are initializing from the history from
> where we read checkpoint.
>
> So I feel if we directly fix this one line to initialize
> "expectedTLEs" from "recoveryTargetTLI" then it will expose to the
> same problem this commit tried to fix.
>
> [2]
> if (tliOfPointInHistory(ControlFile->checkPoint, expectedTLEs) !=
> ControlFile->checkPointCopy.ThisTimeLineID)
> {
> error()
> }

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2021-05-17 03:21:47 Re: prion failed with ERROR: missing chunk number 0 for toast value 14334 in pg_toast_2619
Previous Message Tom Lane 2021-05-17 03:17:46 Re: PG 14 release notes, first draft