Re: Document what is essential and undocumented in pg_basebackup

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Chapman Flack <chap(at)anastigmatix(dot)net>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>, David Steele <david(at)pgmasters(dot)net>
Subject: Re: Document what is essential and undocumented in pg_basebackup
Date: 2022-03-09 19:46:00
Lists: pgsql-hackers


* Chapman Flack (chap(at)anastigmatix(dot)net) wrote:
> On 03/09/22 12:19, Stephen Frost wrote:
> > Let's avoid hijacking [thread about other patch] [1]
> > for an independent debate about what our documentation should or
> > shouldn't include.
> Agreed. New thread here.


> Stephen wrote:
> > Documenting everything that pg_basebackup does to make sure that the
> > backup is viable might be something to work on if someone is really
> > excited about this, but it's not 'dead-simple' and it's darn close to
> > the bare minimum,
> I wrote:
> > if the claim is that an admin who relies on pg_basebackup is relying
> > on essential things pg_basebackup does that have not been enumerated
> > in our documentation yet, I would argue they should be.
> Magnus wrote:
> > For the people who want to drive their backups from a shellscript and
> > for some reason *don't* want to use pg_basebackup, we need to come up
> > with a different API or a different set of tools. That is not a
> > documentation task. That is a "start from a list of which things
> > pg_basebackup cannot do that are still simple, or that tools like
> > pgbackrest cannot do if they're complicated". And then design an API
> > that's actually safe and easy to use *for that usecase*.
> I wrote:
> > That might also be a good thing, but I don't see it as a substitute
> > for documenting the present reality of what the irreducibly essential
> > behaviors of pg_basebackup (or of third-party tools like pgbackrest)
> > are, and why they are so.
> Stephen wrote:
> > I disagree. If we provided a tool then we'd document that tool and how
> > users can use it, not every single step that it does (see also:
> > pg_basebackup).
> I could grant, arguendo, that for most cases where we've "provided a tool"
> that's enough, and still distinguish pg_basebackup from those. In no
> particular order:
> - pg_basebackup comes late to the party. It appears in 9.1 as a tool that
> conveniently automates a process (performing an online base backup)
> that has already been documented since 8.0, six and a half years earlier.
> (While, yes, it streams the file contents over a newly-introduced
> protocol, I don't think anyone has called that one of its irreducibly
> essential behaviors, or claimed that any other way of reliably copying
> those contents during the backup window would be inherently flawed.)
> - By the release where pg_basebackup appears, anyone who is doing
> online backup and PITR is already using some other tooling (third-party
> or locally developed) to do so. There may be benefits and costs in
> migrating those procedures to pg_basebackup. If one of the benefits is
> "your current procedures may be missing essential steps we built into
> pg_basebackup but left out of our documentation" then that is important
> to know for an admin who is making that decision. Even better, knowing
> what those essential steps are will allow that admin to make an informed
> assessment of whether the existing procedures are broken or not.
> - Typical tools are easy for an admin to judge the fitness of.
> The tool does a thing, and you can tell right away if it did the thing
> you needed or not. pg_basebackup, like any backup tool, does a thing,
> and you don't find out if that was the thing you needed until later,
> when failure isn't an option. That's a less-typical kind of a tool,
> for which it's less ok to be a black box.
> - Ultimately, an admin's job isn't "use pg_basebackup" (or "use pgbackrest"
> or "use barman"). The job is "be certain that this cluster is recoverably
> backed up, and for any tool you may be using to do it, that you have the
> same grasp of what the tool has done as if you had done it yourself."
> In view of all that, I would think it perfectly reasonable to present
> pg_basebackup as one convenient and included reference implementation
> of the irreducibly essential steps of an online base backup, which we
> separately document.

... except that pg_basebackup isn't quite that; it just happens to do
the things that *it* needs to do to give some level of confidence that
the backup it took will be usable later.

> I don't think it is as reasonable to say, effectively, that you learn
> what the irreducibly essential steps of an online base backup are by
> reading the source of pg_basebackup, and then intuiting which of the
> details you find there are the essential ones and which are outgrowths
> of its particular design choices.

While reading the pg_basebackup source would be helpful to someone
developing a new backup tool for PG, it's not the only source you'd need
to read: you also need to read the PG source for things like what return
codes from archive_command and restore_command mean to PG, how a
promoted system finds a new timeline, or what .partial or .backup files
mean. Further, you'd need to understand that it's essential that all of
the files from the backup are fsync'd to disk along with the directories
that they're in (which is something you might glean from reading the
pg_basebackup source), as otherwise they might disappear if a crash
happened shortly after the backup was taken. The same goes for how
archive_command has to handle that concern for WAL files. Not to
mention the considerations around how to deal with page-level checksums
when reading from an actively-being-modified PG data directory.
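
To make the durability point concrete, here is a minimal sketch of the
fsync requirement only. It is not what pg_basebackup or pgbackrest
actually do, and it ignores everything else mentioned above (timelines,
.partial/.backup files, checksums). The names fsync_path, fsync_tree and
archive_wal_durably are hypothetical, and the code assumes a POSIX
system where a directory can be opened read-only and fsync'd.

```python
import os
import shutil


def fsync_path(path):
    """Flush one file or directory to stable storage (POSIX assumption)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def fsync_tree(root):
    """fsync every regular file under root, every directory under root
    (including root itself), and finally root's parent directory so that
    the entry for root is durable too. Returns the number of paths synced.
    """
    synced = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are synced.
            if os.path.isfile(full) and not os.path.islink(full):
                fsync_path(full)
                synced += 1
        fsync_path(dirpath)
        synced += 1
    fsync_path(os.path.dirname(os.path.abspath(root)))
    return synced + 1


def archive_wal_durably(src, archive_dir):
    """Copy one WAL file into archive_dir so that it survives a crash.

    Returns 0 on success and 1 on any failure, intended for use as a
    process exit status. Refuses to overwrite an existing archive file.
    """
    dest = os.path.join(archive_dir, os.path.basename(src))
    if os.path.exists(dest):
        return 1
    tmp = dest + ".tmp"  # hypothetical temporary-name convention
    try:
        with open(src, "rb") as fin, open(tmp, "wb") as fout:
            shutil.copyfileobj(fin, fout)
            fout.flush()
            os.fsync(fout.fileno())  # file contents reach disk
        os.rename(tmp, dest)         # file appears under its final name atomically
        fsync_path(archive_dir)      # the directory entry reaches disk
    except OSError:
        return 1
    return 0
```

The return value of archive_wal_durably is shaped that way because PG
treats a zero exit status from archive_command as success and anything
else as a failure to be retried. Even this much is only the durability
piece, which is rather the point.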

Documenting absolutely everything needed to write a good backup tool for
PG strikes me as unlikely to end up actually being useful. Those who
write backup tools for PG are reading the source for PG and likely
wouldn't find such documentation helpful, since not everything needed
would be included even if we did try to document everything, which makes
such an effort a waste of time. The idea that we could document
everything needed, and that someone could then write a simple shell
script or even a simple perl script (as pgbackrest started out as ...)
from that documentation that did everything necessary, is a fiction that
we need to accept as such and move on from.


