Re: Add notes to pg_combinebackup docs

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Martín Marqués <martin(dot)marques(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add notes to pg_combinebackup docs
Date: 2024-04-09 09:44:57
Message-ID: d923aa8a-b753-48dc-ab0a-5e45d932d694@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 4/9/24 09:59, Martín Marqués wrote:
> Hello,
>
> While doing some work/research on the new incremental backup feature
> some limitations were not listed in the docs. Mainly the fact that
> pg_combienbackup works with plain format and not tar.
>

Right. The docs mostly imply this by talking about output directory and
backup directories, but making it more explicit would not hurt.

FWIW it'd be great if we could make incremental backups work with tar
format in the future too. People probably don't want to keep around the
expanded data directory or extract everything before combining the
backups is not very convenient. Reading and writing the tar would make
this simpler.

> Around the same time, Tomas Vondra tested incremental backups with a
> cluster where he enabled checksums after taking the previous full
> backup. After combining the backups the synthetic backup had pages
> with checksums and other pages without checksums which ended in
> checksum errors.
>

I'm not sure just documenting this limitation is sufficient. We can't
make the incremental backups work in this case (it's as if someone
messes with cluster without writing stuff into WAL), but I think we
should do better than silently producing (seemingly) corrupted backups.

I say seemingly, because the backup is actually fine, the only problem
is it has checksums enabled in the controlfile, but the pages from the
full backup (and the early incremental backups) have no checksums.

What we could do is detect this in pg_combinebackup, and either just
disable checksums with a warning and hint to maybe enable them again. Or
maybe just print that the user needs to disable them.

I was thinking maybe we could detect this while taking the backups, and
force taking a full backup if checksums got enabled since the last
backup. But we can't do that because we only have the manifest from the
last backup, and the manifest does not include info about checksums.

It's a bit unfortunate we don't have a way to enable checksums online.
That'd not have this problem IIRC, because it writes proper WAL. Maybe
it's time to revive that idea ... I recall there were some concerns
about tracking progress to allow resuming stuff, but maybe not having
anything because in some (rare?) cases it'd need to do more work does
not seem like a great trade off.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2024-04-09 09:54:41 Incorrect handling of IS [NOT] NULL quals on inheritance parents
Previous Message Matthias van de Meent 2024-04-09 09:25:06 Re: PostgreSQL 17 Release Management Team & Feature Freeze