Re: [BUG] non archived WAL removed during production crash recovery

From: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org, michael(at)paquier(dot)xyz
Subject: Re: [BUG] non archived WAL removed during production crash recovery
Date: 2020-04-02 15:37:44
Message-ID: 20200402173744.0a504243@firost
Lists: pgsql-bugs pgsql-hackers

On Thu, 2 Apr 2020 23:58:00 +0900
Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:

> On 2020/04/02 22:02, Jehan-Guillaume de Rorthais wrote:
> > On Thu, 02 Apr 2020 13:07:34 +0900 (JST)
> > Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> >
> >> Sorry, it was quite ambiguous.
> >>
> >> At Thu, 02 Apr 2020 13:04:43 +0900 (JST), Kyotaro Horiguchi
> >> <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> >>> At Wed, 1 Apr 2020 18:17:35 +0200, Jehan-Guillaume de Rorthais
> >>> <jgdr(at)dalibo(dot)com> wrote in
> >>>> Please, find in attachment a patch implementing this.
> >>>
> >>> The patch partially reintroduces the issue the patch have
> >>> fixed. Specifically a standby running a crash recovery wrongly marks a
> >>> WAL file as ".ready" if it is extant in pg_wal without accompanied by
> >>> .ready file.
> >>
> >> The patch partially reintroduces the issue the commit 78ea8b5daa have
> >> fixed. Specifically a standby running a crash recovery wrongly marks a
> >> WAL file as ".ready" if it is extant in pg_wal without accompanied by
> >> .ready file.
> >
> > As far as I understand StartupXLOG(), NOT_IN_RECOVERY and IN_CRASH_RECOVERY
> > are only set for production clusters, not standby ones.
>
> DB_IN_CRASH_RECOVERY can be set even in standby mode. For example,
> if you start the standby from the cold backup of the primary,

From a cold backup? Then ControlFile->state == DB_SHUTDOWNED, right?

Unless I'm wrong, this should be caught by:

    if (ArchiveRecoveryRequested && ( [...] ||
                                      ControlFile->state == DB_SHUTDOWNED))
    {
        InArchiveRecovery = true;
        if (StandbyModeRequested)
            StandbyMode = true;
    }

With InArchiveRecovery=true, we later set DB_IN_ARCHIVE_RECOVERY instead of
DB_IN_CRASH_RECOVERY.
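
For illustration only, here is a minimal sketch in C of the decision described above, assuming no backup_label is present. It is a model of the reasoning, not PostgreSQL code: the enum is a reduced stand-in for DBState, and enters_archive_recovery(), state_at_recovery_start() and the elided_conditions parameter are hypothetical names. The parameter stands in for the part shown as "[...]" in the snippet, which is left unspecified here.

```c
#include <stdbool.h>

/* Reduced stand-in for PostgreSQL's DBState: only the values discussed here. */
typedef enum
{
    DB_SHUTDOWNED,
    DB_IN_PRODUCTION,
    DB_IN_CRASH_RECOVERY,
    DB_IN_ARCHIVE_RECOVERY
} DBState;

/*
 * Hypothetical helper modelling the quoted test.
 * "elided_conditions" represents the "[...]" part of the snippet and is
 * an assumption of this sketch, not a real variable.
 */
bool
enters_archive_recovery(bool archive_recovery_requested,
                        bool elided_conditions,
                        DBState control_state)
{
    return archive_recovery_requested &&
        (elided_conditions || control_state == DB_SHUTDOWNED);
}

/* Hypothetical helper: the state recorded when recovery starts. */
DBState
state_at_recovery_start(bool in_archive_recovery)
{
    return in_archive_recovery ? DB_IN_ARCHIVE_RECOVERY
                               : DB_IN_CRASH_RECOVERY;
}
```

Under this model, a standby started from a cold backup (control state DB_SHUTDOWNED) goes straight to DB_IN_ARCHIVE_RECOVERY. A copy taken while the primary was running, with the elided conditions false, would start in DB_IN_CRASH_RECOVERY.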

> since InArchiveRecovery is false at the beginning of the recovery,
> DB_IN_CRASH_RECOVERY is set in that moment. But then after all the valid
> WAL in pg_wal have been replayed, InArchiveRecovery is set to true and
> DB_IN_ARCHIVE_RECOVERY is set.

However, I suppose this is true if you restore a backup from a snapshot
without backup_label, right?
