Re: block-level incremental backup

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: block-level incremental backup
Date: 2019-09-17 16:58:23
Message-ID: CA+TgmoYovA4Oz4fJvJt8LoiCt2HZ4+gKNJQLF_VakS=doOPz8A@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 17, 2019 at 12:09 PM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> We need to be sure that we can detect if the WAL level has ever been set
> to minimal between a full and an incremental and, if so, either refuse
> to run the incremental, or promote it to a full, or make it a
> checksum-based incremental instead of trusting the WAL stream.

Sure. What about checksum collisions?

> Just to be clear, I see your points and I like the general idea of
> finding solutions, but it seems like the issues are likely to be pretty
> complex and I'm not sure that's being appreciated very well.

Definitely possible, but it's more helpful if you can point out the
actual issues.

> Ok.. I can understand that but I don't get how these changes to
> pg_basebackup will help facilitate that. If they don't and what you're
> talking about here is independent, then great, that clarifies things,
> but if you're saying that these changes to pg_basebackup are to help
> with backing up directly into those Enterprise systems then I'm just
> asking for some help in understanding how- what's the use-case here that
> we're adding to pg_basebackup that makes it work with these Enterprise
> systems?
>
> I'm not trying to be difficult here, I'm just trying to understand.

Man, I feel like we're totally drifting off into the weeds here. I'm
not arguing that these changes to pg_basebackup will help enterprise
users except insofar as those users want incremental backup. All of
this discussion started with this comment from you:

"Having a system of keeping track of which backups are full and which
are differential in an overall system also gives you the ability to do
things like expiration in a sensible way, including handling WAL
expiration."

All I was doing was saying that for an enterprise user, the overall
system might be something entirely outside of our control, like
NetBackup or Tivoli. Therefore, whatever functionality we provide to
do that kind of thing should be able to be used in such contexts. That
hardly seems like a controversial proposition.

> How would that tool work, if it's to be able to work regardless of where
> the WAL is actually stored..? Today, pg_archivecleanup just works
> against a POSIX filesystem- are you thinking that the tool would have a
> pluggable storage system, so that it could work with, say, a POSIX
> filesystem, or a CIFS mount, or a s3-like system?

Again, I was making a general statement about design goals -- "we
should try to work nicely with enterprise backup products" -- not
proposing a specific design for a specific thing. I don't think the
idea of some pluggability in that area is a bad one, but it's not even
slightly what this thread is about.

> Provided the WAL level is at the level that you need it to be that will
> be true for things which are actually supported with PITR, replication
> to standby servers, et al. I can see how it might come across as an
> overreaction but this strikes me as a pretty glaring issue and I worry
> that if it was overlooked until now that there'll be other more subtle
> issues, and backups are just plain complicated to get right, just to
> begin with already, something that I don't think people appreciate until
> they've been dealing with them for quite a while.

Permit me to be unpersuaded. If it was such a glaring issue, and if
experience is the key to spotting such issues, then why didn't YOU
spot it?

I'm not arguing that this stuff isn't hard. It is. Nor am I arguing
that I didn't screw up. I did. But designs need to be accepted or
rejected based on facts, not FUD. You've raised some good technical
points and if you've got more concerns, I'd like to hear them, but I
don't think arguing vaguely that a certain approach will probably run
into trouble gets us anywhere.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
