Re: Crash on promotion when recovery.conf is renamed

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Crash on promotion when recovery.conf is renamed
Date: 2016-12-17 12:23:54
Message-ID: CABUevEzEoXQAc0ifyVF7XYpPrXOCt3P2_B465bLO4W_uFQirbw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 16, 2016 at 7:08 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> On Thu, Dec 15, 2016 at 11:25:10AM +0200, Heikki Linnakangas wrote:
> > On 12/15/2016 10:44 AM, Magnus Hagander wrote:
> > > I wonder if there might be more corner cases like this, but in this
> > > particular one it seems easy enough to just say that failing to rename
> > > recovery.conf because it didn't exist is safe.
> >
> > Yeah. It's unexpected though, so I think erroring out is quite
> reasonable.
> > If the recovery.conf file went missing, who knows what else is wrong.
>
> i'd rather not mess up with the exiting behavior and just complain to the
> pilot to not do that. This enters in the category of not removing the
> backup_label file after taking a backup...
>

I'm happy to say that people shouldn't do that. However, we have a system
that is unnecessarily fragile, by making something like that so easy to
break. This could equally happen in other ways. For example, someone could
accidentally provision a recovery.conf with the wrong permissions.

Reducing the fragility of such an important part of the system is a big
improvement, IMO. Of course, we have to be extra careful when touching
those parts.

Bottom line, I'm perfectly fine with failing in such a scenario. I'm not
fine with leaving the user with a corrupt cluster if we can avoid it.

As for your comparison, we fixed the backup_label part by adding support
for taking backups without using the fragile system. So it's not like we
didn't recognize the problem.

>
> > > But in the case of failing to rename recovery.conf for example because
> of
> > > permissions errors, we don't want to ignore it. But we also really
> don't
> > > want to end up with this kind of inconsistent data directory IMO. I
> don't
> > > know that code well enough to suggest how to fix it though -- hoping
> for
> > > input for someone who knows it closer?
> >
> > Hmm. Perhaps we should write the timeline history file only after
> renaming
> > recovery.conf. In general, we should keep the window between writing the
> > timeline history file and writing the end-of-recovery record as small as
> > possible. I don't think we can eliminate it completely, but it makes
> sense
> > to minimize it. Something like the attached (completely untested).
>

I haven't looked at the code either, but the reason is definitely solid -
let's keep the time as short as possible. In particular, keeping it
recoverable for things that are caused by humans, because they will keep
making mistakes. And editing recovery.conf is a normal thing for a DBA to
do (sure, not removing it -- but it's one of the few files in the data
directory that the DBA is actually supposed to handle manually, so that
increases the risk).

And also by doing things in an order that will at least make it less likely
to end up corrupt if people manage to do it anyway.

I did some tests. And after a lookup it looks like a good thing to book
> the timeline number at an earlier step and let other nodes know about
> it. So +1 on it.
>

>
> Looking at PrescanPreparedTransactions(), I am thinking as well that it
> would
> be better to get a hard failure when bumping on a corrupted 2PC file.
> Future
> files are one thing, but corrupted files should be treated more carefully.
>

Again without looking at it, I agree (so much easier that way :P). Ignoring
corruption is generally a bad idea. Failing hard makes the user notice the
error, and makes it possible to initiate recovery from a standby or from
backups or something, or to *intentionally* remove/clear/ignore it.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Erik Rijkers 2016-12-17 12:37:21 Re: Logical Replication WIP
Previous Message Ashutosh Sharma 2016-12-17 11:23:09 Re: Hang in pldebugger after git commit : 98a64d0