Re: Remove Deprecated Exclusive Backup Mode

From: David Steele <david(at)pgmasters(dot)net>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Adrien NAYRAT <adrien(dot)nayrat(at)anayrat(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Remove Deprecated Exclusive Backup Mode
Date: 2019-02-26 06:48:58
Message-ID: 8378ce12-6cb0-6fb2-cd0d-0c89232472a0@pgmasters.net
Lists: pgsql-hackers

On 2/26/19 6:51 AM, Michael Paquier wrote:
> On Mon, Feb 25, 2019 at 08:17:27PM +0200, David Steele wrote:
>> Here's the really obvious bad thing: if users do not update their procedures
>> and we ignore backup_label.pending on startup then they will end up with a
>> corrupt database because it will not replay from the correct checkpoint. If
>> we error on the presence of backup_label.pending then we are right back to
>> where we started.
>
> Not really. If we error on backup_label.pending, we can make the
> difference between a backend which has crashed in the middle of an
> exclusive backup without replaying anything and a backend which is
> started based on a base backup, so an operator can take some action to
> see what's wrong with the server. If you issue an error, users can
> also see that their custom backup script is wrong because they forgot
> to rename the flag after taking a backup of the data folder(s).

The operator still has a decision to make, manually, just as they do
now. The wrong decision may mean a corrupt database.

Here's the scenario:

1) They do a restore, forget to rename backup_label.pending.
2) Postgres won't start, which is the same behavior we have now.
3) The user is not sure whether to rename or delete the file. They delete
it, and the cluster is corrupted.

Worse, they may have scripted the deletion of backup_label so that the
cluster will restart after a crash. This is what our documentation
recommends, after all. If that script runs after a restore instead of
a crash, then the cluster will be corrupt -- silently.
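
To make the cases above concrete, here is a minimal Python sketch of the
decision the operator (or a script) faces. It is only a model of the
reasoning in this message, not PostgreSQL code: the names Origin, Action
and outcome() are made up for illustration, and the crash-plus-rename case
is left open because nothing above addresses it.

```python
from enum import Enum


class Origin(Enum):
    """How the data directory got into its current state (hypothetical names)."""
    CRASH_DURING_BACKUP = "crash in the middle of an exclusive backup"
    RESTORED_FROM_BACKUP = "restore from a base backup"


class Action(Enum):
    """What the operator or a script does with backup_label.pending."""
    DELETE = "delete"
    RENAME = "rename to backup_label"


def outcome(origin: Origin, action: Action) -> str:
    """Return the result of an action, as described in the scenario above."""
    if origin is Origin.RESTORED_FROM_BACKUP:
        if action is Action.RENAME:
            return "ok: replay starts from the correct checkpoint"
        # Deleting the label after a restore loses the checkpoint location.
        return "corrupt: replay does not start from the correct checkpoint"
    if action is Action.DELETE:
        # This is the case the crash-restart script is written for.
        return "ok: cluster restarts after the crash"
    return "unknown: not covered by the scenario above"


if __name__ == "__main__":
    # A script that always deletes cannot tell the two origins apart.
    for origin in Origin:
        print(f"{origin.value}: {outcome(origin, Action.DELETE)}")
```

The file on disk looks the same in both rows, so a script that always
chooses DELETE gets the "ok" result after a crash and the "corrupt" result
after a restore, with no error in either case.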

--
-David
david(at)pgmasters(dot)net
