Re: [BUG] non archived WAL removed during production crash recovery

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: michael(at)paquier(dot)xyz
Cc: jgdr(at)dalibo(dot)com, masao(dot)fujii(at)oss(dot)nttdata(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: [BUG] non archived WAL removed during production crash recovery
Date: 2020-04-21 03:09:25
Message-ID: 20200421.120925.76453535156648004.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

At Tue, 21 Apr 2020 11:15:01 +0900, Michael Paquier <michael(at)paquier(dot)xyz> wrote in
> On Mon, Apr 20, 2020 at 02:22:35PM +0200, Jehan-Guillaume de Rorthais wrote:
> > The problem is that we would have to read the controldata file each time we
> > wonder if a segment should be archived/removed. Moreover, the controldata
> > file might not be in sync quickly enough with the real state for some other
> > code path or futur needs.
>
> I don't think that this is what Horiguchi-san meant here. What I got
> from his previous message would be to be to copy the shared value from
> the control file when necessary, and have the shared state use only a
> subset of the existing values of DBState, aka:
> - DB_IN_CRASH_RECOVERY
> - DB_IN_ARCHIVE_RECOVERY
> - DB_IN_PRODUCTION

First I thought as above, but I thought that we could use
ControlFile->state itself in this case, by regarding the symbols less
than DB_IN_CRASH_RECOVERY as RECOVERY_STATE_CRASH. I don't think
there's no problem if the update to DB_IN_ARCHIVE_RECOVERY reaches
checkpointer with some delay.

> Still, that sounds wrong to me because then somebody would be tempted
> to change the shared value thinking that things like DB_SHUTDOWNING,
> DB_SHUTDOWNED_* or DB_STARTUP are valid but we don't want that here.

That is not an issue if we just use DBState to know whether we have
started archive recovery.

> Note that there may be a case for DB_STARTUP to be used in
> XLOGShmemInit(), but I'd rather let the code use the safest default,
> DB_IN_CRASH_RECOVERY to control that we won't remove .ready files by
> default until the startup process sees fit to do the actual switch
> depending on the checkpoint record lookup, if archive recovery was
> actually requested, etc.

I'm not sure I read this correctly. But I think I agree to this.

+ if (!XLogArchivingAlways() &&
+ GetRecoveryState() == RECOVERY_STATE_ARCHIVE)

Is rewritten as

+ if (!XLogArchivingAlways() &&
+ GetDBState() > DB_IN_CRASH_RECOVERY)

FWIW, what annoyed me is there are three variables that are quite
similar but has different domains, ControlFile->state,
XLogCtl->SharedRecoveryState, and LocalRecoveryInProgress. I didn't
mind there were two, but three seems a bit too many to me.

But it may be different issue.

> > Indeed, Benoît Lobréau reported this behavior to me.
>
> Noted. Thanks for the information. I don't think that I have ever
> met Benoît in person, do I? Tell him that I owe him one beer or a
> beverage of his choice when we meet IRL, and that he had better use
> this message-id to make me keep my promise :)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message cbw 2020-04-21 04:07:49 Backend stuck in tirigger.c:afterTriggerInvokeEvents forever
Previous Message Michael Paquier 2020-04-21 02:15:01 Re: [BUG] non archived WAL removed during production crash recovery

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2020-04-21 04:24:29 Re: fixing old_snapshot_threshold's time->xid mapping
Previous Message David Rowley 2020-04-21 03:03:18 Re: Parallel Append can break run-time partition pruning