Re: pg_upgrade --copy-file-range

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_upgrade --copy-file-range
Date: 2024-03-23 13:37:14
Message-ID: 375f5b9b-a4f2-485d-bb8a-da2e8e8a0ecd@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/22/24 19:40, Robert Haas wrote:
> On Fri, Mar 22, 2024 at 1:22 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>> Right, this will happen:
>>
>> pg_combinebackup: error: unable to use accelerated copy when manifest
>> checksums are to be calculated. Use --no-manifest
>>
>> Are you saying we should just silently override the copy method and do
>> the copy block by block?
>
> Yes.
>
>> I'm not strongly opposed to that, but it feels
>> wrong to just ignore that the user explicitly requested cloning, and I'm
>> not sure why should this be different from any other case when the user
>> requests incompatible combination of options and/or options that are not
>> supported on the current configuration.
>
> I don't feel like copying block-by-block when that's needed to compute
> a checksum is ignoring what the user requested. I mean, if we'd had to
> perform reconstruction rather than copying an entire file, we would
> have done that regardless of whether --clone had been there, and I
> don't see why the need-to-compute-a-checksum case is any different. I
> think we should view a flag like --clone as specifying how to copy a
> file when we don't need to do anything but copy it. I don't think it
> should dictate that we're not allowed to perform other processing when
> that other processing is required.
>
> From my point of view, this is not a case of incompatible options
> having been specified. If you specify run pg_basebackup with both
> --format=p and --format=t, those are incompatible options; the backup
> can be done one way or the other, but not both at once. But here
> there's no such conflict. Block-by-block copying and fast-copying can
> happen as part of the same operation, as in the example that I showed,
> where some files need the block-by-block copying and some can be
> fast-copied. The user is entitled to specify which fast-copying method
> they would like to have used for the files where fast-copying is
> possible without getting a failure just because it isn't possible for
> every single file.
>
> Or to say it the other way around, if there's 1 file that needs to be
> copied block by block and 99 files that can be fast-copied, you want
> to force the user to the block-by-block method for all 100 files. I
> want to force the use of the block-by-block method for the 1 file
> where that's the only valid method, and let them choose what they want
> to do for the other 99.
>

OK, that makes sense. Here's a patch that should work like this - in
copy_file we check if we need to calculate checksums, and either use the
requested copy method, or fall back to the block-by-block copy.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
v20240323-0001-pg_combinebackup-allow-using-clone-copy_fi.patch text/x-patch 13.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2024-03-23 13:47:12 Re: pg_upgrade --copy-file-range
Previous Message Robert Haas 2024-03-23 12:55:30 Re: [PATCH] plpython function causes server panic