Re: pg_upgrade --copy-file-range

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_upgrade --copy-file-range
Date: 2023-07-03 07:47:05
Message-ID: e191da9a-0966-841c-8842-14aa1128957b@eisentraut.org
Lists: pgsql-hackers

On 02.06.23 21:30, Thomas Munro wrote:
> I was just in a pg_upgrade unconference session at PGCon where the
> lack of $SUBJECT came up. This system call gives the kernel the
> option to use fast block cloning on XFS, ZFS (as of very recently),
> etc, and works on Linux and FreeBSD. It's probably much the same as
> --clone mode on COW file systems, except that is Linux-only. On
> overwrite file systems (i.e., not copy-on-write, like ext4), it may also
> be able to push copies down to storage hardware/network file systems.
>
> There was something like this in the nearby large files patch set, but
> in that version it just magically did it when available in --copy
> mode. Now I think the user should have to opt in with
> --copy-file-range, and it should simply error out if it fails. It may not
> work in some cases -- for example, the man page says that older Linux
> systems can fail with EXDEV when you try to copy across file systems,
> while newer systems will do something less efficient but still
> sensible internally; also I saw a claim that some older versions had
> weird bugs. Better to just expose the raw functionality and let users
> say when they want it and read the error if it fails, I think.

When we added --clone, copy_file_range() was available, but the problem
was that it was hard for the user to predict whether they'd get the fast
clone behavior or the slow copy behavior. That's the kind of thing you
want to know when planning and testing your upgrade. At the time, there
were patches circulating in Linux kernel circles that would have made it
possible to enforce cloning via the flags argument of copy_file_range(),
but they didn't make it into mainline.

So, yes, being able to specify exactly which copy mechanism to use makes
sense, so that users can choose the tradeoffs.
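
To illustrate the opt-in, fail-loudly behavior being discussed, here is a
minimal standalone sketch. It is not taken from the patch: the function
name copy_with_copy_file_range() and the COPY_CHUNK constant are made up
for this example, and it assumes a platform that provides
copy_file_range() (Linux with a reasonably recent glibc, or FreeBSD).
There is deliberately no fallback to read()/write(); any failure is
returned to the caller with errno intact.

```c
#define _GNU_SOURCE             /* exposes copy_file_range() on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-call request size; the kernel may copy less. */
#define COPY_CHUNK ((size_t) 1 << 30)

/*
 * Hypothetical helper, not pg_upgrade code: copy src to a new file dst
 * using only copy_file_range().  Returns 0 on success, or -1 with errno
 * set.  No fallback copy method is attempted.
 */
static int
copy_with_copy_file_range(const char *src, const char *dst)
{
    int         src_fd;
    int         dst_fd;
    int         save_errno;
    ssize_t     nbytes;

    assert(src != NULL && dst != NULL);

    if ((src_fd = open(src, O_RDONLY)) < 0)
        return -1;

    /* O_EXCL: refuse to overwrite an existing destination file. */
    if ((dst_fd = open(dst, O_WRONLY | O_CREAT | O_EXCL, 0600)) < 0)
    {
        save_errno = errno;
        close(src_fd);
        errno = save_errno;
        return -1;
    }

    /*
     * NULL offsets mean "use and advance each descriptor's own file
     * position".  A return value of 0 signals end of the source file.
     */
    do
    {
        nbytes = copy_file_range(src_fd, NULL, dst_fd, NULL, COPY_CHUNK, 0);
    } while (nbytes > 0);

    save_errno = errno;
    close(src_fd);
    close(dst_fd);

    if (nbytes < 0)
    {
        errno = save_errno;     /* e.g. EXDEV on older Linux kernels */
        return -1;
    }
    return 0;
}
```

A caller would report strerror(errno) and stop when this returns -1,
rather than quietly switching to a different copy method.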

About your patch:

I think you should have a "check" function called from
check_new_cluster(). That check function can then also handle the "not
supported" case, and you don't need to handle that in
parseCommandLine(). I suggest following the clone example for these,
since the issues there are very similar.
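
Roughly, such a check could try the system call once on a scratch file
before any real work starts. The following is only a standalone sketch
under my own assumptions, not the actual pg_upgrade code: the function
name probe_copy_file_range(), the probe file name, and the error-buffer
interface are all invented for illustration. A real version would report
through pg_upgrade's own error handling.

```c
#define _GNU_SOURCE             /* exposes copy_file_range() on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical pre-flight check: attempt one copy_file_range() call from
 * an existing file into a scratch file in new_dir.  Returns true if the
 * call works; otherwise returns false and describes the failure in errbuf.
 */
static bool
probe_copy_file_range(const char *existing_file, const char *new_dir,
                      char *errbuf, size_t errlen)
{
    char        probe[1024];
    int         src_fd;
    int         dst_fd;
    bool        ok = false;

    assert(existing_file && new_dir && errbuf && errlen > 0);
    errbuf[0] = '\0';

    /* The probe file name is an assumption made for this sketch. */
    snprintf(probe, sizeof(probe), "%s/copy_file_range.probe", new_dir);
    unlink(probe);              /* remove leftovers from an earlier run */

    if ((src_fd = open(existing_file, O_RDONLY)) < 0)
    {
        snprintf(errbuf, errlen, "could not open \"%s\": %s",
                 existing_file, strerror(errno));
        return false;
    }
    if ((dst_fd = open(probe, O_WRONLY | O_CREAT | O_EXCL, 0600)) < 0)
    {
        snprintf(errbuf, errlen, "could not create \"%s\": %s",
                 probe, strerror(errno));
        close(src_fd);
        return false;
    }

    /* One small request is enough to learn whether the call is usable. */
    if (copy_file_range(src_fd, NULL, dst_fd, NULL, 8192, 0) < 0)
        snprintf(errbuf, errlen,
                 "copy_file_range() failed between \"%s\" and \"%s\": %s",
                 existing_file, probe, strerror(errno));
    else
        ok = true;

    close(src_fd);
    close(dst_fd);
    unlink(probe);              /* never leave the scratch file behind */
    return ok;
}
```

Running this once up front, across the old and new data directories,
means the user learns about EXDEV or an unsupported platform during the
check phase instead of partway through the file transfer.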
