Re: Crash on promotion when recovery.conf is renamed

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Crash on promotion when recovery.conf is renamed
Date: 2016-12-15 09:25:10
Message-ID: 6ad88b1b-18e2-5e40-15cb-fef2d477d1ea@iki.fi
Lists: pgsql-hackers

On 12/15/2016 10:44 AM, Magnus Hagander wrote:
> I wonder if there might be more corner cases like this, but in this
> particular one it seems easy enough to just say that failing to rename
> recovery.conf because it didn't exist is safe.

Yeah. It's unexpected though, so I think erroring out is quite
reasonable. If the recovery.conf file went missing, who knows what else
is wrong.

> But in the case of failing to rename recovery.conf for example because of
> permissions errors, we don't want to ignore it. But we also really don't
> want to end up with this kind of inconsistent data directory IMO. I don't
> know that code well enough to suggest how to fix it though -- hoping for
> input from someone who knows it better?

Hmm. Perhaps we should write the timeline history file only after
renaming recovery.conf. In general, we should keep the window between
writing the timeline history file and writing the end-of-recovery record
as small as possible. I don't think we can eliminate it completely, but
it makes sense to minimize it. Something like the attached (completely
untested).

- Heikki

Attachment Content-Type Size
reorder-end-of-archive-recovery-actions-1.patch invalid/octet-stream 2.7 KB
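
The reordering suggested in the message can be illustrated with a small toy model. This is only a sketch under stated assumptions: it is not the attached patch (whose contents are not reproduced here) and it is not PostgreSQL code. The function name end_of_archive_recovery, the flat directory layout, and the placeholder file contents are all hypothetical. The point it tries to show is that if the step that can fail (renaming recovery.conf) runs first, a failure leaves no timeline history file behind.

```python
import os
import tempfile


def end_of_archive_recovery(data_dir, new_tli):
    """Toy model of the proposed ordering; hypothetical, not PostgreSQL code."""
    conf = os.path.join(data_dir, "recovery.conf")
    done = os.path.join(data_dir, "recovery.done")
    # Assumption: the history file is placed directly in data_dir for simplicity.
    history = os.path.join(data_dir, f"{new_tli:08X}.history")

    # Step 1: do the rename that may fail (missing file, permissions).
    # If it raises, nothing for the new timeline has been written yet.
    os.rename(conf, done)

    # Step 2: only now create the timeline history file (placeholder contents).
    with open(history, "w") as f:
        f.write("placeholder\n")

    # Step 3 (not modelled): the end-of-recovery record would be written next,
    # keeping the window after step 2 as small as possible.
    return history


# Failure case: recovery.conf is missing, so the rename raises first
# and the directory is left without a history file.
with tempfile.TemporaryDirectory() as d:
    try:
        end_of_archive_recovery(d, 2)
    except FileNotFoundError:
        pass
    leftover = os.listdir(d)
```

With the opposite order (history file first, rename second), the same failure would leave a history file for a timeline that was never actually started, which appears to be the inconsistent state discussed in the quoted text.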
