Re: block-level incremental backup

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: block-level incremental backup
Date: 2019-04-15 16:48:57
Message-ID: 20190415164857.nosv5foqbcmeyhcl@momjian.us
Lists: pgsql-hackers

On Mon, Apr 15, 2019 at 09:01:11AM -0400, Stephen Frost wrote:
> Greetings,
>
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> > Several companies, including EnterpriseDB, NTT, and Postgres Pro, have
> > developed technology that permits a block-level incremental backup to
> > be taken from a PostgreSQL server. I believe the idea in all of those
> > cases is that non-relation files should be backed up in their
> > entirety, but for relation files, only those blocks that have been
> > changed need to be backed up.
>
> I love the general idea of having additional facilities in core to
> support block-level incremental backups. I've long been unhappy that
> any such approach ends up being limited to a subset of the files which
> need to be included in the backup, meaning the rest of the files have to
> be backed up in their entirety. I don't think we have to solve for that
> as part of this, but I'd like to see a discussion for how to deal with
> the other files which are being backed up to avoid needing to just
> wholesale copy them.

I assume you are talking about non-heap/index files. Which of those are
large enough to benefit from incremental backup?

> > I would like to propose that we should
> > have a solution for this problem in core, rather than leaving it to
> > each individual PostgreSQL company to develop and maintain their own
> > solution.
>
> I'm certainly a fan of improving our in-core backup solutions.
>
> I'm quite concerned that trying to graft this on to pg_basebackup
> (which, as you note later, is missing an awful lot of what users expect
> from a real backup solution already- retention handling, parallel
> capabilities, WAL archive management, and many more... but also is just
> not nearly as developed a tool as the external solutions) is going to
> make things unnecessarily difficult when what we really want here is
> better support from core for block-level incremental backup for the
> existing external tools to leverage.

I think there is some interesting complexity brought up in this thread.
Which options are going to minimize storage I/O and network I/O, have
only background overhead, allow parallel operation, and integrate with
pg_basebackup? Eventually we will need to evaluate the incremental
backup options against these criteria.
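
As a rough illustration of how these criteria pull against each other,
here is a minimal Python sketch of one possible change-detection
approach. It is an assumption for discussion, not something proposed in
this thread: scan each relation file and keep only the blocks whose page
LSN (the pd_lsn field in the first 8 bytes of the page header) is newer
than the LSN at which the previous backup started. The 8192-byte block
size, the little-endian byte order, and the names page_lsn and
changed_blocks are all assumptions made for the example.

```python
import struct
from typing import List

BLCKSZ = 8192  # assumed: PostgreSQL's default block size


def page_lsn(page: bytes) -> int:
    """Return the LSN stored at the start of a page header.

    pd_lsn is stored as two 32-bit halves (high, then low) in the
    server's native byte order; little-endian is assumed here.
    """
    xlogid, xrecoff = struct.unpack_from("<II", page, 0)
    return (xlogid << 32) | xrecoff


def changed_blocks(relation_data: bytes, since_lsn: int) -> List[int]:
    """Return block numbers whose page LSN is newer than since_lsn.

    Only complete blocks are examined; a trailing partial block is ignored.
    """
    changed = []
    for offset in range(0, len(relation_data) - BLCKSZ + 1, BLCKSZ):
        page = relation_data[offset:offset + BLCKSZ]
        if page_lsn(page) > since_lsn:
            changed.append(offset // BLCKSZ)
    return changed
```

Note that a scan like this still reads every block, so it would reduce
network I/O and backup size but not storage I/O, which is exactly the
kind of trade-off the criteria above are meant to expose.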

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
