Re: Remove Deprecated Exclusive Backup Mode

From: David Steele <david(at)pgmasters(dot)net>
To: Christophe Pettus <xof(at)thebuild(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Remove Deprecated Exclusive Backup Mode
Date: 2019-02-25 18:43:01
Message-ID: ffb32e79-210d-69b8-ef6c-209232d41ece@pgmasters.net
Lists: pgsql-hackers

Hi Christophe,

On 2/25/19 7:24 PM, Christophe Pettus wrote:
>
>
>> On Feb 25, 2019, at 08:55, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>>
>> I honestly do doubt that they have had the same experiences that I have
>> had
>
> Well, I guarantee you that no two people on this list have had identical experiences. :) I certainly have been bitten by the problems with the current system. But the resistance to major version upgrades is *huge*, and I'm strongly biased against anything that will make that harder. I'm not sure I'm communicating how big a problem telling many large installations, "If you move to v12/13/etc., you will have to change your backup system" is going to be.

I honestly think you are underestimating how bad this can be.

The prevailing wisdom is that it's unfortunate that these backup_labels
get left around but they can be removed with scripting, so no big deal.
After that the cluster will start.

But -- if you are too aggressive about removing the backup_label and
accidentally do it before a real restore from backup, then you have a
corrupt cluster. Totally silent, but definitely corrupt. You'll
probably only see it when you start getting consistency errors from the
indexes, if ever. Page checksums won't catch it either unless you are
*lucky* enough to have a torn page.

Erroneous scripting of this kind can also affect backups that were made
with the non-exclusive method since the backups look the same.
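To make the hazard concrete, here is a minimal sketch of the kind of cleanup script described above. It is an illustration only: the helper names (naive_cleanup, make_data_dir, demo) and the directory names are hypothetical and not taken from any real tool, and the "data directories" are empty stand-ins that contain nothing but a placeholder backup_label. The point is that a check based on the file's presence behaves identically whether the label was left behind by an interrupted exclusive backup or belongs to a backup that was just restored and still needs it.

```python
import os
import tempfile


def naive_cleanup(pgdata: str) -> bool:
    """Remove backup_label if it exists; return True if a file was removed.

    This is the risky pattern: the only input is whether the file exists,
    so the function cannot know why the file is there.
    """
    label = os.path.join(pgdata, "backup_label")
    if os.path.exists(label):
        os.remove(label)
        return True
    return False


def make_data_dir(root: str, name: str) -> str:
    """Create a stand-in data directory holding a placeholder backup_label."""
    pgdata = os.path.join(root, name)
    os.makedirs(pgdata)
    with open(os.path.join(pgdata, "backup_label"), "w") as f:
        f.write("placeholder label contents\n")
    return pgdata


def demo() -> dict:
    """Run the cleanup against both scenarios and report what it did."""
    with tempfile.TemporaryDirectory() as root:
        # Scenario 1: label left over after an interrupted exclusive backup.
        crashed = make_data_dir(root, "crashed_during_exclusive_backup")
        # Scenario 2: label that is part of a freshly restored backup.
        restored = make_data_dir(root, "freshly_restored_backup")
        return {
            "crashed": naive_cleanup(crashed),
            "restored": naive_cleanup(restored),
        }


if __name__ == "__main__":
    print(demo())
```

In both scenarios the function removes the file and reports success, so nothing in the script's own output signals that the second removal was the harmful one.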

fsync() is the major corruption issue we are facing right now but that
doesn't mean there aren't other sources of corruption we should be
thinking about. I've thought about this one a lot and it scares me.

I've worked on ways to make it better, but all of them break something
and involve compromises that are nearly as severe as removing exclusive
backups entirely.

Regards,
--
-David
david(at)pgmasters(dot)net
