Re: [BUG] non archived WAL removed during production crash recovery

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: jgdr(at)dalibo(dot)com
Cc: masao(dot)fujii(at)oss(dot)nttdata(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, michael(at)paquier(dot)xyz
Subject: Re: [BUG] non archived WAL removed during production crash recovery
Date: 2020-04-09 02:26:57
Message-ID: 20200409.112657.2065530970880154180.horikyota.ntt@gmail.com
Lists: pgsql-bugs pgsql-hackers

Hello, Jehan.

At Wed, 8 Apr 2020 15:26:03 +0200, Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com> wrote in
> On Wed, 08 Apr 2020 17:39:09 +0900 (JST)
> Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> > At Tue, 7 Apr 2020 17:17:36 +0200, Jehan-Guillaume de Rorthais
> > <jgdr(at)dalibo(dot)com> wrote in
> > > > +/* Recovery state */
> > > > +typedef enum RecoveryState
> > > > +{
> > > > + NOT_IN_RECOVERY = 0,
> > > > + IN_CRASH_RECOVERY,
> > > > + IN_ARCHIVE_RECOVERY
> > > > +} RecoveryState;
> >
> > I'm not sure the complexity is required here. Are we assuming that
> > archive_mode can be changed before restarting?
>
> I assume it can yes. Eg., one can restore a PITR backup as a standby and change
> the value of archive_mode to either off, on or always.

Thanks. I was confused. The original issue was that a restarted master
can miss files in the archive. To fix that alone, it would be
sufficient not to ignore .ready files. But the problem is more than
that.

> > At Thu, 2 Apr 2020 15:49:15 +0200, Jehan-Guillaume de Rorthais
> > <jgdr(at)dalibo(dot)com> wrote in
> > > > Ok, so our *current* consensus seems to be the following. Right?
> > > >
> > > > - If archive_mode=off, any WAL files with .ready files are removed in
> > > > crash recovery, archive recovery and standby mode.
> > >
> > > yes
> >
> > If archive_mode = off no WAL files are marked as ".ready".
>
> Sure, on the primary side.
>
> What if you build a standby from a backup with archive_mode=on with
> some .ready files in there?

Well. A backup has nothing in its archive_status directory if it is
taken by pg_basebackup. If the backup is created some other way, it
can have some .ready files (as Fujii-san mentioned). A master with
archive_mode != off and a standby with archive_mode=always should
archive WAL files that are not marked .done, but a standby with
archive_mode=on should not. The commit intended that, but its mistake
is that it assumes inRecovery represents whether the server is running
as a standby, whereas it is actually also true on a primary during
crash recovery.

On the other hand, with the patch, a standby with archive_mode=on
wrongly archives WAL files during crash recovery.

What we should check there is, as the commit intended, not whether the
server is in crash or archive recovery, but whether it is running as a
primary or as a standby.
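
To make the distinction concrete, here is a minimal sketch in C of the
rule described above, next to the check that confuses "in recovery"
with "standby". All names here (SketchArchiveMode,
sketch_should_archive, and so on) are hypothetical and exist only for
illustration; this is not the actual PostgreSQL code nor the proposed
patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three archive_mode settings. */
typedef enum SketchArchiveMode
{
    SKETCH_ARCHIVE_OFF = 0,
    SKETCH_ARCHIVE_ON,
    SKETCH_ARCHIVE_ALWAYS
} SketchArchiveMode;

/*
 * Intended rule: decide by the server's role, not by its recovery state.
 *   - archive_mode=off:     never archive.
 *   - primary, on/always:   archive WAL files not marked .done.
 *   - standby:              archive only when archive_mode=always.
 */
bool
sketch_should_archive(SketchArchiveMode mode, bool is_standby)
{
    if (mode == SKETCH_ARCHIVE_OFF)
        return false;
    if (is_standby)
        return mode == SKETCH_ARCHIVE_ALWAYS;
    return true;
}

/*
 * Flawed variant (assumption: a simplified model of the mistake discussed
 * in this thread). It treats "in recovery" as if it meant "standby", so a
 * primary doing crash recovery with archive_mode=on is told not to archive.
 */
bool
sketch_should_archive_by_recovery_flag(SketchArchiveMode mode, bool in_recovery)
{
    if (mode == SKETCH_ARCHIVE_OFF)
        return false;
    if (in_recovery)
        return mode == SKETCH_ARCHIVE_ALWAYS;
    return true;
}
```

With archive_mode=on, a primary in crash recovery has is_standby =
false but in_recovery = true, so the first function says "archive" and
the second says "do not archive". A standby in crash recovery has
is_standby = true, so the first function does not archive there either.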

> > If it is "always", WAL files that are to be archived are
> > marked as ".ready". Finally, the condition reduces to:
> >
> > If archiver is running, archive ".ready" files. Otherwise ignore
> > ".ready" and just remove WAL files after use.
> > >
> > > > That is, WAL files with .ready files are removed when either
> > > > archive_mode!=always in standby mode or archive_mode=off.
> > >
> > > sounds fine to me.
> >
> > That situation implies that archive_mode has been changed.
>
> Why? archive_mode may have been "always" on the primary when eg. a snapshot has
> been created.

.ready files are created only when archive_mode != off.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
