Re: Race condition in recovery?

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: dilipbalaut(at)gmail(dot)com
Cc: robertmhaas(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, hlinnaka(at)iki(dot)fi
Subject: Re: Race condition in recovery?
Date: 2021-05-27 06:39:29
Message-ID: 20210527.153929.1775309370691180150.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Thu, 27 May 2021 11:44:47 +0530, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote in
> Maybe we can somehow achieve that without a broken archive command,
> but I am not sure how it is enough to just delete WAL from pg_wal? I
> mean my original case was that
> 1. Got the new history file from the archive but did not get the WAL
> file yet which contains the checkpoint after TL switch
> 2. So the standby2 try to stream using new primary using old TL and
> set the wrong TL in expectedTLEs
>
> But if you are not doing anything to stop archiving WAL files or to
> guarantee that WAL has come to archive and you deleted those then I am
> not sure how we are reproducing the original problem.

Thanks for the reply!

We're writing at the very beginning of the switching segment at
promotion time. So it is guaranteed that the first segment of the
newer timeline won't be archived until the remaining (almost) 16MB of
the segment is consumed or someone explicitly causes a segment switch
(including archive timeout).
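
As a rough illustration of the point above (a toy model only, not
PostgreSQL code; the function name and the 8192-byte offset are made up
for the example, and the 16MB default segment size is assumed):

```python
# Toy model of when a WAL segment becomes eligible for archiving.
# Hypothetical names throughout; this is not how PostgreSQL implements it.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumes the default 16MB segment size


def segment_ready_for_archive(write_offset, forced_switch=False):
    """Return True once the segment would be handed to the archiver.

    The segment is eligible when the write position has reached the end
    of the segment, or when a segment switch was forced (for example by
    pg_switch_wal() or by archive_timeout expiring).
    """
    return forced_switch or write_offset >= WAL_SEGMENT_SIZE


# Just after promotion: writing has only started near the beginning
# of the switching segment, so nothing is archived yet.
just_promoted = segment_ready_for_archive(write_offset=8192)

# The rest of the segment has been consumed.
after_fill = segment_ready_for_archive(write_offset=WAL_SEGMENT_SIZE)

# Someone forced a segment switch early.
after_switch = segment_ready_for_archive(write_offset=8192, forced_switch=True)

print(just_promoted, after_fill, after_switch)
```

Until one of the last two cases happens, the archive can hold the new
history file without the first segment of the new timeline.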

> BTW, I have also tested your script and I found below log, which shows
> that standby2 is successfully able to select the timeline2 so it is
> not reproducing the issue. Am I missing something?

standby_2? My last one, 026_timeline_issue_2.pl, doesn't use that name
and uses "standby_1" and "cascade". In the earlier ones, standby_4 and
5 (or 3 and 4 in the later versions) are used in the additional tests.

So I think it should be something different?

--
Kyotaro Horiguchi
NTT Open Source Software Center
