Re: pg_combinebackup --copy-file-range

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_combinebackup --copy-file-range
Date: 2024-04-07 17:53:08
Message-ID: 52a726ec-b958-4ea5-9fea-c0b62af66ac4@enterprisedb.com
Lists: pgsql-hackers

On 4/7/24 19:46, Tomas Vondra wrote:
> On 4/5/24 21:43, Tomas Vondra wrote:
>> Hi,
>>
>> ...
>>
>> 2) The prefetching is not a huge improvement, at least not for these
>> three filesystems (btrfs, ext4, xfs). From the color scale it might seem
>> like it helps, but those values are relative to the baseline, so if the
>> non-prefetching value is 5% and the prefetching value is 10%, prefetching
>> actually makes it slower. And that's very often true.
>>
>> This is visible more clearly in prefetching.pdf, comparing the
>> non-prefetching and prefetching results for each patch, not to baseline.
>> That makes it quite clear there's a lot of "red" where prefetching
>> makes it slower. It certainly does help for larger increments (which
>> makes sense, because the modified blocks are distributed randomly, and
>> thus come from random files, making long streaks unlikely).
>>
>> I had imagined the prefetching could be made a bit smarter and ignore the
>> streaks (=sequential patterns), but once again - this only matters with
>> the batching, which we don't have. And without the batching it looks
>> like a net loss (that's the first column in the prefetching PDF).
>>
>> I did start thinking about prefetching because of ZFS, where it was
>> necessary to get decent performance. And that's still true. But (a) even
>> with the explicit prefetching it's still 2-3x slower than any of these
>> filesystems, so I assume performance-sensitive use cases won't use it.
>> And (b) the prefetching seems necessary in all cases, no matter how
>> large the increment is, which goes directly against the idea of looking
>> at how random the blocks are and prefetching only the sufficiently
>> random patterns. That doesn't seem like a great thing.
>>
>
> I finally got more complete ZFS results, and I also decided to get
> some numbers without the ZFS tuning I did. And boy oh boy ...
>
> All the tests I did with ZFS were tuned the way I've seen recommended
> when using ZFS for PostgreSQL, that is
>
> zfs set recordsize=8K logbias=throughput compression=none
>
> and this performed quite poorly - pg_combinebackup took 4-8x longer than
> with the traditional filesystems (btrfs, xfs, ext4) and the only thing
> that improved that measurably was prefetching.
>
> But once I reverted to the default recordsize of 128kB, the performance
> is waaaaaay better - entirely comparable to ext4/xfs, while btrfs
> remains faster with --copy-file-range --no-manifest (by a factor
> of 2-3x).
>

I forgot to say explicitly that I think this confirms the decision not to
push the patch adding the explicit prefetching to pg_combinebackup. It's
not needed/beneficial even for ZFS, when using a suitable configuration.
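
For reference, here's roughly what the two configurations look like. This is
a minimal sketch only; the dataset name tank/pgdata is a placeholder for
wherever the data directory lives, and I'm spelling the compression value
as "off" here:

    # the "tuned for PostgreSQL" setup used in the earlier, slow runs
    zfs set recordsize=8K tank/pgdata
    zfs set logbias=throughput tank/pgdata
    zfs set compression=off tank/pgdata

    # go back to the default 128kB recordsize (inherit it from the parent)
    zfs inherit recordsize tank/pgdata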

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
