Re: pg_upgrade --copy-file-range

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_upgrade --copy-file-range
Date: 2024-03-19 15:22:46
Message-ID: 3024283a-7491-4240-80d0-421575f6bb23@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I took a quick look at the remaining part adding copy_file_range to
pg_combinebackup. The patch no longer applies, so I had to rebase it.
Most of the issues were trivial, but I had to fix a couple missing
prototypes - I added them to copy_file.h/c, mostly.

0001 is the minimal rebase + those fixes

0002 has a couple review comments in copy_file, and it also undoes a lot
of unnecessary formatting changes (already pointed out by Peter a couple
days ago).

A couple review comments:

1) AFAIK opt_errinfo() returns pointer to the local "buf" variable.

2) I wonder if we even need opt_errinfo(). I'm not sure it actually
makes anything simpler.

3) I think it'd be nice to make CopyFileMethod more consistent with
transferMode in pg_upgrade.h (I mean, it seems wise to make the naming
more consistent, it's probably not worth unifying this somehow).

4) I wonder how we came up with copying the files by 50 blocks, but I
now realize it's been like this before this patch. I only noticed
because the patch adds a comment before buffer_size calculation.

5) I dislike the renaming of copy_file_blocks to pg_copyfile. The new
name is way more generic / less descriptive - it's clear it copies the
file block by block (well, in chunks). pg_copyfile is pretty vague.

6) This leaves behind copy_file_copyfile, which is now unused.

7) The patch reworks how combinebackup deals with alternative copy
implementations - instead of setting strategy_implementation and calling
that, the decisions now happen in pg_copyfile_offload with a lot of
conditions / ifdef / defined ... I find it pretty hard to understand and
reason about. I liked the strategy_implementation approach, as it forces
us to keep each method in a separate function.

Perhaps there's a reason why that doesn't work for copy_file_range? But
in that case this needs much clearer comments.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
v20240319-0002-review-and-cleanup.patch text/x-patch 29.5 KB
v20240319-0001-rebased-patch.patch text/x-patch 40.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2024-03-19 15:24:19 Re: speed up a logical replica setup
Previous Message Alexander Korotkov 2024-03-19 15:20:18 Re: POC: Lock updated tuples in tuple_update() and tuple_delete()