Re: Race condition in recovery?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Race condition in recovery?
Date: 2021-05-21 16:14:42
Message-ID: CA+TgmoZoUR4MfaJOLe3k0nfSwrLTTrwN+tvtrErjqzP1BWGSQw@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 21, 2021 at 10:39 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > so we might have
> > the timeline history in RECOVERYHISTORY but that's not the filename
> > we're actually going to try to read from inside readTimeLineHistory().
> > In the second case, findNewestTimeLine() will call
> > existsTimeLineHistory() which results in the same situation. So I
> > think when readRecoveryCommandFile() returns expectedTLI can be
> > different but the history file can be absent since it was only ever
> > restored under a temporary name.
>
> I agree that readTimeLineHistory() will not look for that filename,
> but it will also try to get the file using (RestoreArchivedFile(path,
> histfname, "RECOVERYHISTORY", 0, false)). So after we check the
> history file existence in existsTimeLineHistory(), if the file got
> removed from the archive (not sure how) then it is possible that now
> readTimeLineHistory() will not find that history file again. Am I
> missing something?

That sounds right.

I've lost the thread of what we're talking about here a bit. I think
what we've established is that, when running a commit prior to
ee994272ca50f70b53074f0febaec97e28f83c4e, if (a) recovery_target_tli
is set, (b) restore_command works, and (c) nothing's being removed
from the archive concurrently, then by the time StartupXLOG() does
expectedTLEs = readTimeLineHistory(recoveryTargetTLI), any timeline
history file that exists in the archive will have been restored, and
the scenario ee994272ca50f70b53074f0febaec97e28f83c4e was concerned
about won't occur. That's because it was concerned about a scenario
where we failed to restore the history file until after we set
expectedTLEs.
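
To make the ordering argument concrete, here is a toy Python model. It
is not PostgreSQL code: restore_from_archive(), read_timeline_history()
and startup() are hypothetical stand-ins, and the archive is just a
dict. The only behaviors it borrows from the real thing are the
"%08X.history" file naming and the fact that a missing history file is
treated as "no parent timelines". It shows that the resulting
expectedTLEs depends on whether the history file was restored before or
after it was read.

```python
def history_filename(tli):
    # Timeline history files are named with the TLI as 8 hex digits.
    return f"{tli:08X}.history"


def restore_from_archive(archive, local, tli):
    # Hypothetical stand-in for running restore_command on a history file.
    name = history_filename(tli)
    if name in archive:
        local[name] = archive[name]
        return True
    return False


def read_timeline_history(local, tli):
    # If the history file is not present locally, assume no parents and
    # return a list containing only the requested timeline.
    parents = local.get(history_filename(tli))
    if parents is None:
        return [tli]
    return [tli] + list(parents)


def startup(archive, restore_before_read, target_tli=2):
    # Toy version of the sequence under discussion: the only thing that
    # varies is whether the restore happens before expectedTLEs is set.
    local = {}
    if restore_before_read:
        restore_from_archive(archive, local, target_tli)
    expected_tles = read_timeline_history(local, target_tli)
    if not restore_before_read:
        # Too late: expected_tles was already computed without the file.
        restore_from_archive(archive, local, target_tli)
    return expected_tles


# Assumed archive contents: timeline 2 branched off timeline 1.
ARCHIVE = {history_filename(2): [1]}

if __name__ == "__main__":
    print("restored first:", startup(ARCHIVE, restore_before_read=True))
    print("restored late: ", startup(ARCHIVE, restore_before_read=False))
```

In this model, restoring first yields the full timeline list, while
restoring late leaves expectedTLEs with only the target timeline, which
is the kind of stale state the commit was concerned about.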

Consequently, if we want to try to reproduce the problem fixed by that
commit, we should look for a scenario that does not involve setting
recovery_target_tli.

Is that the conclusion you were driving towards?

--
Robert Haas
EDB: http://www.enterprisedb.com
