Re: pg_combinebackup does not detect missing files

From: David Steele <david(at)pgmasters(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_combinebackup does not detect missing files
Date: 2024-04-23 23:22:58
Message-ID: 908d3845-e6dd-43cb-82f3-56f11b57a98f@pgmasters.net
Lists: pgsql-hackers

On 4/22/24 23:53, Robert Haas wrote:
> On Sun, Apr 21, 2024 at 8:47 PM David Steele <david(at)pgmasters(dot)net> wrote:
>>> I figured that wouldn't be particularly meaningful, and
>>> that's pretty much the only kind of validation that's even
>>> theoretically possible without a bunch of extra overhead, since we
>>> compute checksums on entire files rather than, say, individual blocks.
>>> And you could really only do it for the final backup in the chain,
>>> because you should end up accessing all of those files, but the same
>>> is not true for the predecessor backups. So it's a very weak form of
>>> verification.
>>
>> I don't think it is weak if you can verify that the output is exactly as
>> expected, i.e. all files are present and have the correct contents.
>
> I don't understand what you mean here. I thought we were in agreement
> that verifying contents would cost a lot more. The verification that
> we can actually do without much cost can only check for missing files
> in the most recent backup, which is quite weak. pg_verifybackup is
> available if you want more comprehensive verification and you're
> willing to pay the cost of it.

I simply meant that it is *possible* to verify the output of
pg_combinebackup without explicitly verifying all the backups. There
would be overhead, yes, but it would be less than verifying each backup
individually. For my 2c that efficiency would make it worth doing
verification in pg_combinebackup, with perhaps a switch to turn it off
if the user is confident in their sources.
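
To make the idea concrete, here is a minimal sketch of output-only
verification. It is not how pg_combinebackup behaves today. It assumes a
hypothetical mapping from relative file path to expected SHA-256 digest
for the combined output, and a hypothetical `check_contents` flag standing
in for the switch mentioned above.

```python
import hashlib
from pathlib import Path


def verify_output(output_dir, expected, check_contents=True):
    """Return (missing, mismatched) lists of relative paths.

    expected: hypothetical dict of relative path -> SHA-256 hex digest.
    check_contents=False skips checksumming and only checks that every
    expected file exists (the cheap, weaker form of verification).
    """
    root = Path(output_dir)
    missing, mismatched = [], []
    for rel_path, want in sorted(expected.items()):
        path = root / rel_path
        if not path.is_file():
            missing.append(rel_path)
            continue
        if check_contents:
            digest = hashlib.sha256()
            with path.open("rb") as f:
                # Read in 1 MiB chunks so large files are not loaded whole.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
            if digest.hexdigest() != want:
                mismatched.append(rel_path)
    return missing, mismatched
```

With `check_contents=False` only the existence check runs; with the
default every output file is read once, so the cost is one pass over the
output rather than a pass over each backup in the chain.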

>> I think it is a worthwhile change and we are still a month away from
>> beta1. We'll see if anyone disagrees.
>
> I don't plan to press forward with this in this release unless we get
> a couple of +1s from disinterested parties. We're now two weeks after
> feature freeze and this is design behavior, not a bug. Perhaps the
> design should have been otherwise, but two weeks after feature freeze
> is not the time to debate that.

It doesn't appear that anyone but me is terribly concerned about
verification, even in this weak form, so it is probably best to hold this
patch until the next release. As you say, it is late in the game.

Regards,
-David
