Re: trying again to get incremental backup

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: trying again to get incremental backup
Date: 2023-10-04 20:08:29
Message-ID: CA+TgmoYw=8CnKEPQX9qk2estSTCaB7OGcnVM8oa-hAePG2O4Yw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 3, 2023 at 2:21 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Here's a new patch set, also addressing Jakub's observation that
> MINIMUM_VERSION_FOR_WAL_SUMMARIES needed updating.

Here's yet another new version. In this version, I reversed the order
of the first two patches, with the idea that what's now 0001 seems
fairly reasonable as an independent commit, and could thus perhaps be
committed sometime soon-ish. In the main patch, I added SGML
documentation for pg_combinebackup. I also fixed the broken TAP tests
so that they work, by basing them on pg_dump equivalence rather than
file-level equivalence. I'm sad to give up on testing the latter, but
it seems to be unrealistic. I cleaned up a few other odds and ends,
too. But, what exactly is the bigger picture for this patch in terms
of moving forward? Here's a list of things that are on my mind:

- I'd like to get the patch to mark the redo point in the WAL
committed[1] and then reword this patch set to make use of that
infrastructure. Right now, we make a best effort to end WAL summaries
at redo point boundaries, but it's racey, and sometimes we fail to do
so. In theory that just has the effect of potentially making an
incremental backup contain some extra blocks that it shouldn't really
need to contain, but I think it can actually lead to weird stalls,
because when an incremental backup is taken, we have to wait until a
WAL summary shows up that extends at least up to the start LSN of the
backup we're about to take. I believe all the logic in this area can
be made a good deal simpler and more reliable if that patch gets
committed and this one reworked accordingly.

- I would like some feedback on the generation of WAL summary files.
Right now, I have it enabled by default, and summaries are kept for a
week. That means that, with no additional setup, you can take an
incremental backup as long as the reference backup was taken in the
last week. File removal is governed by mtimes, so if you change the
mtimes of your summary files or whack your system clock around, weird
things might happen. But obviously this might be inconvenient. Some
people might not want WAL summary files to be generated at all because
they don't care about incremental backup, and other people might want
them retained for longer, and still other people might want them to be
not removed automatically or removed automatically based on some
criteria other than mtime. I don't really know what's best here. I
don't think the default policy that the patches implement is
especially terrible, but it's just something that I made up and I
don't have any real confidence that it's wonderful. One point to be
consider here is that, if WAL summarization is enabled, checkpoints
can't remove WAL that isn't summarized yet. Mostly that's not a
problem, I think, because the WAL summarizer is pretty fast. But it
could increase disk consumption for some people. I don't think that we
need to worry about the summaries themselves being a problem in terms
of space consumption; at least in all the cases I've tested, they're
just not very big.

- On a related note, I haven't yet tested this on a standby, which is
a thing that I definitely need to do. I don't know of a reason why it
shouldn't be possible for all of this machinery to work on a standby
just as it does on a primary, but then we need the WAL summarizer to
run there too, which could end up being a waste if nobody ever tries
to take an incremental backup. I wonder how that should be reflected
in the configuration. We could do something like what we've done for
archive_mode, where on means "only on if this is a primary" and you
have to say always if you want it to run on standbys as well ... but
I'm not sure if that's a design pattern that we really want to
replicate into more places. I'd be somewhat inclined to just make
whatever configuration parameters we need to configure this thing on
the primary also work on standbys, and you can set each server up as
you please. But I'm open to other suggestions.

- We need to settle the question of whether to send the whole backup
manifest to the server or just the LSN. In a previous attempt at
incremental backup, we decided the whole manifest was necessary,
because flat-copying files could make new data show up with old LSNs.
But that version of the patch set was trying to find modified blocks
by checking their LSNs individually, not by summarizing WAL. And since
the operations that flat-copy files are WAL-logged, the WAL summary
approach seems to eliminate that problem - maybe an LSN (and the
associated TLI) is good enough now. This also relates to Jakub's
question about whether this machinery could be used to fast-forward a
standby, which is not exactly a base backup but ... perhaps close
enough? I'm somewhat inclined to believe that we can simplify to an
LSN and TLI; however, if we do that, then we'll have big problems if
later we realize that we want the manifest for something after all. So
if anybody thinks that there's a reason to keep doing what the patch
does today -- namely, upload the whole manifest to the server --
please speak up.

- It's regrettable that we don't have incremental JSON parsing; I
think that means anyone who has a backup manifest that is bigger than
1GB can't use this feature. However, that's also a problem for the
existing backup manifest feature, and as far as I can see, we have no
complaints about it. So maybe people just don't have databases with
enough relations for that to be much of a live issue yet. I'm inclined
to treat this as a non-blocker, although Andrew Dunstan tells me he
does have a prototype for incremental JSON parsing so maybe that will
land and we can use it here.

- Right now, I have a hard-coded 60 second timeout for WAL
summarization. If you try to take an incremental backup and the WAL
summaries you need don't show up within 60 seconds, the backup times
out. I think that's a reasonable default, but should it be
configurable? If yes, should that be a GUC or, perhaps better, a
pg_basebackup option?

- I'm curious what people think about the pg_walsummary tool that is
included in 0006. I think it's going to be fairly important for
debugging, but it does feel a little bit bad to add a new binary for
something pretty niche. Nevertheless, merging it into any other
utility seems relatively awkward, so I'm inclined to think both that
this should be included in whatever finally gets committed and that it
should be a separate binary. I considered whether it should go in
contrib, but we seem to have moved to a policy that heavily favors
limiting contrib to extensions and loadable modules, rather than
binaries.

Clearly there's a good amount of stuff to sort out here, but we've
still got quite a bit of time left before feature freeze so I'd like
to have a go at it. Please let me know your thoughts, if you have any.

[1] http://postgr.es/m/CA+TgmoZAM24Ub=uxP0aWuWstNYTUJQ64j976FYJeVaMJ+qD0uw@mail.gmail.com

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v4-0006-Add-new-pg_walsummary-tool.patch application/octet-stream 11.4 KB
v4-0001-Refactor-parse_filename_for_nontemp_relation-to-p.patch application/octet-stream 11.9 KB
v4-0003-Change-how-a-base-backup-decides-which-files-have.patch application/octet-stream 11.4 KB
v4-0004-Move-src-bin-pg_verifybackup-parse_manifest.c-int.patch application/octet-stream 4.3 KB
v4-0005-Prototype-patch-for-incremental-and-differential-.patch application/octet-stream 329.6 KB
v4-0002-Change-struct-tablespaceinfo-s-oid-member-from-ch.patch application/octet-stream 11.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2023-10-04 20:15:03 Re: Pre-proposal: unicode normalized text
Previous Message Jim Jones 2023-10-04 20:03:38 Add annotation syntax to pg_hba.conf entries