Re: [BUG] non archived WAL removed during production crash recovery

From: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org, michael(at)paquier(dot)xyz
Subject: Re: [BUG] non archived WAL removed during production crash recovery
Date: 2020-04-02 13:49:15
Message-ID: 20200402154915.4984cff2@firost
Lists: pgsql-bugs pgsql-hackers

On Thu, 2 Apr 2020 19:38:59 +0900
Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:

> On 2020/04/02 16:23, Kyotaro Horiguchi wrote:
> > At Thu, 2 Apr 2020 14:19:15 +0900, Fujii Masao
> > <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
[...]
> >> is whether to remove such WAL files in archive recovery case with
> >> archive_mode=on. Those WAL files would be required when recovering
> >> from the backup taken before that archive recovery happens.
> >> So it seems unsafe to remove them in that case.
> >
> > I'm not sure I'm getting the intention correctly, but I think it is
> > the responsibility of the operator to provide a complete set of
> > archived WAL files for a backup. Could you elaborate on what operation
> > steps you are assuming?
>
> Please imagine the case where you need to do archive recovery
> from the database snapshot taken while there are many WAL files
> with .ready files. Those WAL files have not been archived yet.
> In this case, ISTM those WAL files should not be removed until
> they are archived, when archive_mode = on.

If you rely on a snapshot taken without pg_start_backup/pg_stop_backup, I
agree. These WAL files should be archived if:

* archive_mode >= on for primary
* archive_mode = always for standby

> >> Therefore, IMO the patch should change the code so that
> >> unarchived WAL files are not removed, not only in crash recovery
> >> but also in archive recovery. Thoughts?
> >
> > Agreed if "an unarchived WAL" means "a WAL file that is marked .ready"
> > and it should be archived immediately. My previous mail is written
> > based on the same thought.
>
> Ok, so our *current* consensus seems to be the following. Right?
>
> - If archive_mode=off, any WAL files with .ready files are removed in
> crash recovery, archive recovery and standby mode.

yes

> - If archive_mode=on, WAL files with .ready files are removed only in
> standby mode. In crash recovery and archive recovery cases, they keep
> remaining and would be archived after recovery finishes (i.e., during
> normal processing).

yes

> - If archive_mode=always, in crash recovery, archive recovery and
> standby mode, WAL files with .ready files are archived if WAL archiver
> is running.

yes

> That is, WAL files with .ready files are removed when either
> archive_mode!=always in standby mode or archive_mode=off.

sounds fine to me.
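
For reference, the rules agreed on above can be written down as a small
decision function. This is only an illustrative sketch in Python, not
PostgreSQL code: the names ArchiveMode, RecoveryState and
ready_wal_is_removable are made up for the example, and it deliberately
ignores the "if WAL archiver is running" condition of the archive_mode=always
case.

```python
from enum import Enum


class ArchiveMode(Enum):
    """The three archive_mode settings discussed in this thread."""
    OFF = "off"
    ON = "on"
    ALWAYS = "always"


class RecoveryState(Enum):
    """The three situations listed in the consensus (hypothetical names)."""
    CRASH = "crash recovery"
    ARCHIVE = "archive recovery"
    STANDBY = "standby mode"


def ready_wal_is_removable(archive_mode: ArchiveMode,
                           state: RecoveryState) -> bool:
    """Return True if a WAL file that still has a .ready file may be removed.

    Encodes the summary rule: such files are removed when either
    archive_mode=off, or archive_mode!=always in standby mode.
    """
    if archive_mode is ArchiveMode.OFF:
        # Nothing will ever archive these files, in any state.
        return True
    if state is RecoveryState.STANDBY:
        # On a standby, only archive_mode=always archives (and keeps) them.
        return archive_mode is not ArchiveMode.ALWAYS
    # Crash or archive recovery with archive_mode=on/always: keep the file
    # so that it can be archived later.
    return False


if __name__ == "__main__":
    # Print the full decision table for the nine combinations.
    for mode in ArchiveMode:
        for state in RecoveryState:
            verdict = "remove" if ready_wal_is_removable(mode, state) else "keep"
            print(f"archive_mode={mode.value:<6} {state.value:<16} -> {verdict}")
```

Running it prints one line per combination, which makes it easy to compare
against the three bullet points above.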

[...]
> >>>>>> Another is to make the startup process remove .ready file if
> >>>>>> necessary.
> >>>>>
> >>>>> I'm not sure I understand this one.
> >>
> >> I was thinking to make the startup process remove such unarchived WAL
> >> files
> >> if archive_mode=on and StandbyModeRequested/ArchiveRecoveryRequested
> >> is true.

Ok, understood.

> > As mentioned above, I don't understand the point of preserving WAL
> > files that are either marked as .ready or not marked at all on a
> > standby with archive_mode=on.
>
> Maybe yes. But I'm not confident that there is no such case.

Well, it seems to me that this is what you suggested a few paragraphs above:

«.ready files are removed when either archive_mode!=always in standby mode»
