Re: "using previous checkpoint record at" maybe not the greatest idea?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: "using previous checkpoint record at" maybe not the greatest idea?
Date: 2016-02-03 14:28:24
Message-ID: CA+TgmoZK77m+H3ZZ_9n+figFjDJyOru8xoWybNJEnHZzanZV-w@mail.gmail.com
Lists: pgsql-hackers

On Mon, Feb 1, 2016 at 6:58 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> currently if, when not in standby mode, we can't read a checkpoint
> record, we automatically fall back to the previous checkpoint, and start
> replay from there.
>
> Doing so without user intervention doesn't actually seem like a good
> idea. While not super likely, it's entirely possible that doing so can
> wreck a cluster that'd otherwise be easily recoverable. Imagine e.g. a
> tablespace being dropped - going back to the previous checkpoint very
> well could lead to replay not finishing, as the directory to create
> files in doesn't even exist.
>
> As there's, afaics, really no "legitimate" reason for needing to go
> back to the previous checkpoint I don't think we should do so in an
> automated fashion.
>
> All the cases where I could find logs containing "using previous
> checkpoint record at" were when something else had already gone pretty
> badly wrong. Now that obviously doesn't have a very large significance,
> because the situations where it "just worked" are unlikely to be
> reported...
>
> Am I missing a reason for doing this by default?

I agree: this seems like a terrible idea. Would we still have some
way of forcing the older checkpoint record to be used if somebody
wants to try to do that?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
