Re: Race condition in recovery?

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: robertmhaas(at)gmail(dot)com
Cc: dilipbalaut(at)gmail(dot)com, hlinnaka(at)iki(dot)fi, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Race condition in recovery?
Date: 2021-05-21 07:49:24
Message-ID: 20210521.164924.2279362546489183892.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Fri, 21 May 2021 11:21:05 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> At Thu, 20 May 2021 13:49:10 -0400, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in
> In case (c), recoveryTargetTLI > checkpoint TLI. In this case
> we expect that the checkpoint TLI is in the history of
> recoveryTargetTLI; otherwise recovery fails. This case is similar
> to case (a), but the relationship between recoveryTargetTLI and
> the checkpoint TLI is not confirmed yet. ReadRecord barks later if
> they are not compatible, so this is not a serious problem, but it might
> be better to check the relationship there. My first proposal
> performed a mutual check between the two, but we need to check in
> only one direction.
>
> if (readFile < 0)
> {
>     if (!expectedTLEs)
>     {
>         expectedTLEs = readTimeLineHistory(receiveTLI);
> +       if (!tliOfPointInHistory(receiveTLI, expectedTLEs))
> +           ereport(ERROR, "the received timeline %d is not found in the history file for timeline %d");
>
>
> > > Conclusion:
> > > - I think now we agree on the point that initializing expectedTLEs
> > > with the recovery target timeline is the right fix.
> > > - We still have some differences of opinion about what was the
> > > original problem in the base code which was fixed by the commit
> > > (ee994272ca50f70b53074f0febaec97e28f83c4e).
> >
> > I am also still concerned about whether we understand in exactly what
> > cases the current logic doesn't work. We seem to more or less agree on
> > the fix, but I don't think we really understand precisely what case we
> > are fixing.
>
> Does the discussion above make sense?
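
The one-directional check discussed for case (c) can be illustrated with a small standalone sketch. It assumes a deliberately simplified model: a timeline history is just an array of timeline IDs, and the only question asked is whether the checkpoint's timeline appears in the history of the recovery target timeline. The names TimelineEntry, tli_in_history and checkpoint_tli_in_target_history are hypothetical and are not the PostgreSQL functions or types; the real history entries also carry switch-point positions that this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one timeline history entry. */
typedef struct
{
    uint32_t tli;   /* timeline ID */
} TimelineEntry;

/* Return true if `tli` appears among the `n` entries of `history`. */
bool
tli_in_history(uint32_t tli, const TimelineEntry *history, size_t n)
{
    assert(history != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (history[i].tli == tli)
            return true;
    }
    return false;
}

/*
 * One-directional check (assumed model): the checkpoint's timeline must
 * be found in the history of the recovery target timeline. The reverse
 * direction is not checked.
 */
bool
checkpoint_tli_in_target_history(uint32_t checkpoint_tli,
                                 const TimelineEntry *target_history,
                                 size_t n)
{
    return tli_in_history(checkpoint_tli, target_history, n);
}
```

In this model, a target history of {3, 2, 1} accepts a checkpoint on timeline 1 or 2 and rejects one on timeline 4, which corresponds to the situation where recovery should fail early rather than waiting for a later complaint.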

This is a revised version.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v3-0001-Set-expectedTLEs-correctly-based-on-recoveryTarge.patch text/x-patch 5.6 KB
