Re: pg_combinebackup --copy-file-range

From: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_combinebackup --copy-file-range
Date: 2024-04-04 10:25:46
Message-ID: CAKZiRmyVsEpV=KDGz-x8FhnGVE4oS8pNmn2=vJ==S+aOhoaaUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 4, 2024 at 12:56 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> Hi,
>
> Here's a much more polished and cleaned up version of the patches,
> fixing all the issues I've been aware of, and with various parts merged
> into much more cohesive parts (instead of keeping them separate to make
> the changes/evolution more obvious).

OK, so three runs of incrementalbackupstests - as stated earlier -
also passed with OK for v20240403 (his time even with
--enable-casserts)

pg_combinebackup flags tested were:
1) --copy-file-range --manifest-checksums=CRC32C
2) --copy-file-range --manifest-checksums=NONE
3) default, no flags (no copy-file-range)

> I changed how I think about this a bit - I don't really see the CoW copy
> methods as necessary faster than the regular copy (even though it can be
> with (5)). I think the main benefits are the space savings, enabled by
> patches (1)-(3). If (4) and (5) get it, that's a bonus, but even without
> that I don't think the performance is an issue - everything has a cost.

I take i differently: incremental backups without CoW fs would be clearly :
- inefficient in terms of RTO (SANs are often a key thing due to
having fast ability to "restore" the clone rather than copying the
data from somewhere else)
- pg_basebackup without that would be unusuable without space savings
(e.g. imagine daily backups @ 10+TB DWHs)

> On 4/3/24 15:39, Jakub Wartak wrote:
> > On Mon, Apr 1, 2024 at 9:46 PM Tomas Vondra
> > <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
[..]
> Thanks. Would be great if you could run this on the attached version of
> the patches, ideally for each of them independently, so make sure it
> doesn't get broken+fixed somewhere on the way.

Those are semi-manual test runs (~30 min? per run), the above results
are for all of them applied at once. So my take is all of them work
each one does individually too.

FWIW, I'm also testing your other offlist incremental backup
corruption issue, but that doesnt seem to be related in any way to
copy_file_range() patches here.

> >> 2) prefetch
> >> -----------
> > [..]
> >> I think this means we may need a "--prefetch" option, that'd force
> >> prefetching, probably both before pread and copy_file_range. Otherwise
> >> people on ZFS are doomed and will have poor performance.
> >
> > Right, we could optionally cover in the docs later-on various options
> > to get the performance (on XFS use $this, but without $that and so
> > on). It's kind of madness dealing with all those performance
> > variations.
> >
>
> Yeah, something like that. I'm not sure we want to talk too much about
> individual filesystems in our docs, because those things evolve over
> time too.

Sounds like Wiki then.

BTW, after a quick review: could we in 05 have something like common
value then (to keep those together via some .h?)

#define BATCH_SIZE PREFETCH_TARGET ?

-J.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jelte Fennema-Nio 2024-04-04 10:43:29 Re: psql not responding to SIGINT upon db reconnection
Previous Message Michael Paquier 2024-04-04 10:14:47 Re: Autogenerate some wait events code and documentation