Re: Race condition in recovery?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Race condition in recovery?
Date: 2021-05-17 19:58:47
Message-ID: CA+Tgmob37PEkGpT8sHJv-hfmS+FbEs2+uZ9Jf1EUkG5TOTFUTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, May 15, 2021 at 1:25 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > As I understand it, the general issue here was that if
> > XLogFileReadAnyTLI() was called before expectedTLEs got set, then
> > prior to this commit it would have to fail, because the foreach() loop
> > in that function would be iterating over an empty list. Heikki tried
> > to make it not fail in that case, by setting tles =
> > readTimeLineHistory(recoveryTargetTLI), so that the foreach loop
> > *wouldn't* get an empty list.
>
> I might be missing something but I don't agree with this logic. If
> you see prior to this commit the code flow was like below[1]. So my
> point is if we are calling XLogFileReadAnyTLI() somewhere while
> reading the checkpoint record then note that expectedTLEs were
> initialized unconditionally before even try to read that checkpoint
> record. So how expectedTLEs could be uninitialized in
> LogFileReadAnyTLI?

Sorry, you're right. It couldn't be uninitialized, but it could be a
fake 1-element list saying there are no ancestors rather than the real
value. So I think the point was to avoid that.

> Another point which I am not sure about but still I think that one
> line (expectedTLEs = readTimeLineHistory(receiveTLI);), somewhere
> related to the purpose of this commit. Let me explain why do I think
> so. Basically, before this commit, we were initializing
> "expectedTLEs" based on the history file of "recoveryTargetTLI", right
> after calling "readRecoveryCommandFile()" (this function will
> initialize recoveryTargetTLI based on the recovery target, and it
> ensures it read the respective history file). Now, right after this
> point, there was a check as shown below[2], which is checking whether
> the checkpoint TLI exists in the "expectedTLEs" which is initialized
> based on recoveryTargetTLI. And it appeared that this check was
> failing in some cases which this commit tried to fix and all other
> code is there to support that. Because now before going for reading
> the checkpoint we are not initializing "expectedTLEs" so now after
> moving this line from here it was possible that "expectedTLEs" is not
> initialized in XLogFileReadAnyTLI() and the remaining code in
> XLogFileReadAnyTLI() is to handle that part.

I think the issue here is: If expectedTLEs was initialized before the
history file was available, and contained a dummy 1-element list, then
tliOfPointInHistory() is going to say that every LSN is on that
timeline rather than any previous timeline. And if we are supposed to
be switching timelines then that will lead to this sanity check
failing.

> Now, coming to my point that why this one line is related, In this
> one line (expectedTLEs = readTimeLineHistory(receiveTLI);), we
> completely avoiding recoveryTargetTLI and initializing "expectedTLEs"
> based on the history file of the TL from which we read the checkpoint,
> so now, there is no scope of below[2] check to fail because note that
> we are not initializing "expectedTLEs" based on the
> "recoveryTargetTLI" but we are initializing from the history from
> where we read checkpoint.

I agree, but that's actually bad, isn't it? I mean if we want the
sanity check to never fail we can just take it out. What we want to
happen is that the sanity check should pass if the startup timeline if
the TLI of the startup checkpoint is in the history of the recovery
target timeline, but fail if it isn't. The only way to achieve that
behavior is if expectedTLEs is initialized from the recovery target
timeline.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2021-05-17 20:10:39 Re: pg_hba.conf.sample wording improvement
Previous Message Magnus Hagander 2021-05-17 19:56:23 Re: Winflex docs and distro