Re: Updated backup APIs for non-exclusive backups

From: David Steele <david(at)pgmasters(dot)net>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Marco Nenciarini <marco(dot)nenciarini(at)2ndquadrant(dot)it>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Updated backup APIs for non-exclusive backups
Date: 2016-03-31 02:00:31
Message-ID: 56FC84BF.1050401@pgmasters.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/30/16 4:18 AM, Magnus Hagander wrote:
>
> On Wed, Mar 30, 2016 at 4:10 AM, David Steele <david(at)pgmasters(dot)net
> <mailto:david(at)pgmasters(dot)net>> wrote:
>
> This certainly looks like it would work but it raises the barrier for
> implementing backups by quite a lot. It's fine for backrest or barman
> but it won't be pleasant for anyone who has home-grown scripts.
>
>
> How much does it really raise the bar, though?
>
> It would go from "copy all files and make damn sure you copy pg_control
> last, and rename it to pg_control.backup" to "take this binary blob you
> got from the server and write it to pg_control.backup"?
>
> Also, the target of these APIs is specifically the backup tools and not
> homewritten scripts.

Then what would home-grown scripts use, the deprecated API that we know
has issues?

> A simple shellscript will have trouble enough using
> it in the first place since it requires a persistent connection to the
> database.

All that's required is to spawn a psql process. I'm no shell expert but
that's simple enough.

> But those scripts are likely broken anyway.

Yes, probably. Backup and especially archiving correctly are harder
than most people realize.

> <...>
>
> The main reason for Heikki to suggest this one over the other basic one
> is that it brings protection against the "backup script/program crashed
> halfway through but the user still tried to restore from that". They will
> outright fail because there is no pg_control.backup in that case.

But if we are going to make this complicated I'm not sure it's a big
deal just to require pg_control to be copied last. pgBackRest already
does that and Barman probably does, too.

I don't see "don't copy pg_control" and "copy pg_control last" as all
that different in terms of complexity.

pgBackRest also *restores* pg_control last which I think is pretty
important and would not be addressed by this solution.

> If we
> don't care about that, then we can go back to just saying "copy
> pg_control last and we're done". But you yourself complained about that
> requirement because it's too easy to get wrong (though you advocated
> using backup_label to transfer the data over -- but that has the
> potential for getting more complicated if we now or at any point in the
> future want more than one field to transfer, for example).

Perhaps. I'm still not convinced that getting some information from
backup_label and other information from pg_control is a good solution.
I would rather write the recovery point into the backup_label and use
that instead of the value in pg_control. Text files are much easier to
parse and push around accurately (and test).

--
-David
david(at)pgmasters(dot)net

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2016-03-31 02:19:15 Re: snapshot too old, configured by time
Previous Message Michael Paquier 2016-03-31 01:48:11 Re: Correction for replication slot creation error message in 9.6