Re: [BUG] non archived WAL removed during production crash recovery

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org, michael(at)paquier(dot)xyz
Subject: Re: [BUG] non archived WAL removed during production crash recovery
Date: 2020-04-03 06:44:40
Message-ID: 4c4eccd5-04f2-f570-968d-9f50ee912961@oss.nttdata.com
Lists: pgsql-bugs pgsql-hackers

On 2020/04/03 0:37, Jehan-Guillaume de Rorthais wrote:
> On Thu, 2 Apr 2020 23:58:00 +0900
> Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>
>> On 2020/04/02 22:02, Jehan-Guillaume de Rorthais wrote:
>>> On Thu, 02 Apr 2020 13:07:34 +0900 (JST)
>>> Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
>>>
>>>> Sorry, it was quite ambiguous.
>>>>
>>>> At Thu, 02 Apr 2020 13:04:43 +0900 (JST), Kyotaro Horiguchi
>>>> <horikyota(dot)ntt(at)gmail(dot)com> wrote in
>>>>> At Wed, 1 Apr 2020 18:17:35 +0200, Jehan-Guillaume de Rorthais
>>>>> <jgdr(at)dalibo(dot)com> wrote in
>>>>>> Please, find in attachment a patch implementing this.
>>>>>
>>>>> The patch partially reintroduces the issue the patch has
>>>>> fixed. Specifically, a standby running a crash recovery wrongly marks a
>>>>> WAL file as ".ready" if it exists in pg_wal without an accompanying
>>>>> .ready file.
>>>>
>>>> The patch partially reintroduces the issue that commit 78ea8b5daa
>>>> fixed. Specifically, a standby running a crash recovery wrongly marks a
>>>> WAL file as ".ready" if it exists in pg_wal without an accompanying
>>>> .ready file.
>>>
>>> As far as I understand StartupXLOG(), NOT_IN_RECOVERY and IN_CRASH_RECOVERY
>>> are only set for production clusters, not standby ones.
>>
>> DB_IN_CRASH_RECOVERY can be set even in standby mode. For example,
>> if you start the standby from the cold backup of the primary,
>
> In cold backup? Then ControlFile->state == DB_SHUTDOWNED, right?
>
> Unless I'm wrong, this should be caught by:
>
>   if (ArchiveRecoveryRequested && ( [...] ||
>                                     ControlFile->state == DB_SHUTDOWNED))
>   {
>       InArchiveRecovery = true;
>       if (StandbyModeRequested)
>           StandbyMode = true;
>   }
>
> With InArchiveRecovery=true, we later set DB_IN_ARCHIVE_RECOVERY instead of
> DB_IN_CRASH_RECOVERY.

Yes, you're right. I should have mentioned one more condition in my
previous email: the cold backup was taken from a server that had been
shut down in immediate mode. In that case, the code block you pointed
to is skipped and InArchiveRecovery is not set to true there.
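
To illustrate the case described here, a minimal sketch in C follows. It is
not the actual StartupXLOG() code: the struct, the function names and the
other_conditions flag (which stands in for the elided "[...]" part of the
quoted condition) are hypothetical. It also assumes that pg_control still
says DB_IN_PRODUCTION after an immediate shutdown.

```c
#include <stdbool.h>

/* Simplified stand-in for the control file states named in this thread. */
typedef enum
{
    DB_SHUTDOWNED,
    DB_IN_PRODUCTION,
    DB_IN_CRASH_RECOVERY,
    DB_IN_ARCHIVE_RECOVERY
} DBState;

/* Hypothetical container for the inputs of the quoted condition. */
typedef struct
{
    bool    archive_recovery_requested; /* ArchiveRecoveryRequested */
    bool    other_conditions;           /* stands in for the elided "[...]" */
    DBState control_state;              /* ControlFile->state at startup */
} StartupInput;

/* Mirrors the quoted check that sets InArchiveRecovery = true. */
bool
starts_in_archive_recovery(const StartupInput *in)
{
    return in->archive_recovery_requested &&
        (in->other_conditions || in->control_state == DB_SHUTDOWNED);
}

/* State recorded at the beginning of recovery, per the discussion. */
DBState
initial_recovery_state(const StartupInput *in)
{
    return starts_in_archive_recovery(in)
        ? DB_IN_ARCHIVE_RECOVERY
        : DB_IN_CRASH_RECOVERY;
}
```

With a cold backup of a cleanly shut down primary (control_state ==
DB_SHUTDOWNED) the sketch yields DB_IN_ARCHIVE_RECOVERY. With a backup taken
after an immediate shutdown (control_state == DB_IN_PRODUCTION under the
assumption above, other_conditions false) it yields DB_IN_CRASH_RECOVERY, even
though recovery was requested.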

>> since InArchiveRecovery is false at the beginning of the recovery,
>> DB_IN_CRASH_RECOVERY is set at that moment. But then, after all the valid
>> WAL in pg_wal has been replayed, InArchiveRecovery is set to true and
>> DB_IN_ARCHIVE_RECOVERY is set.
>
> However, I suppose this is true if you restore a backup from a snapshot
> without backup_label, right?

Maybe yes.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
