pgsql: Increase distance between flush requests during bulk file copies

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-committers(at)postgresql(dot)org
Subject:	pgsql: Increase distance between flush requests during bulk file copies
Date:	2017-10-08 19:25:45
Message-ID:	E1e1HCr-0008EB-7h@gemulon.postgresql.org
Lists:	pgsql-committers

Increase distance between flush requests during bulk file copies.

copy_file() reads and writes data 64KB at a time (with default BLCKSZ),
and historically has issued a pg_flush_data request after each write.
This turns out to interact really badly with macOS's new APFS file
system: a large file copy takes over 100X longer than it ought to on
APFS, as reported by Brent Dearth. While that's arguably a macOS bug,
it's not clear whether Apple will do anything about it in the near
future, and in any case experimentation suggests that issuing flushes
a bit less often can be helpful on other platforms too.

Hence, rearrange the logic in copy_file() so that flush requests are
issued once per N writes rather than every time through the loop.
I set the FLUSH_DISTANCE to 32MB on macOS (any less than that still
results in a noticeable speed degradation on APFS), but 1MB elsewhere.
In limited testing on Linux and FreeBSD, this seems slightly faster
than the previous code, and certainly no worse. It helps noticeably
on macOS even with the older HFS filesystem.
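As a rough illustration of the "once per N writes" idea described above, here is a minimal sketch that models only the flush bookkeeping of such a loop. It is not the code in copydir.c: the names sketch_count_flushes, SKETCH_BUF_SIZE and SKETCH_FLUSH_DISTANCE are hypothetical, the __APPLE__ test is an assumed way to detect macOS, and the real read(), write() and pg_flush_data() calls are replaced by simple counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chunk size: 64KB per read/write, as in the message. */
#define SKETCH_BUF_SIZE ((size_t) 64 * 1024)

/* Hypothetical per-platform distance: 32MB on macOS, 1MB elsewhere. */
#if defined(__APPLE__)
#define SKETCH_FLUSH_DISTANCE ((size_t) 32 * 1024 * 1024)
#else
#define SKETCH_FLUSH_DISTANCE ((size_t) 1024 * 1024)
#endif

/*
 * Model of a copy loop that defers flush requests until at least
 * flush_distance bytes have been written since the last request.
 * Returns the number of flush requests that would be issued for a
 * file of file_size bytes.  No I/O is performed.
 */
size_t
sketch_count_flushes(size_t file_size, size_t flush_distance)
{
    size_t offset = 0;        /* bytes "written" so far */
    size_t flush_offset = 0;  /* bytes already covered by a flush request */
    size_t flushes = 0;

    assert(flush_distance > 0);

    while (offset < file_size)
    {
        size_t nbytes;

        /* Enough unflushed data has accumulated: request a flush. */
        if (offset - flush_offset >= flush_distance)
        {
            flushes++;        /* stands in for a flush of this range */
            flush_offset = offset;
        }

        /* Stands in for one read() + write() of up to one buffer. */
        nbytes = file_size - offset;
        if (nbytes > SKETCH_BUF_SIZE)
            nbytes = SKETCH_BUF_SIZE;
        offset += nbytes;
    }

    /* Flush whatever tail is left after the last in-loop request. */
    if (offset > flush_offset)
        flushes++;

    return flushes;
}
```

Under this model, a 64MB file copied in 64KB chunks would see 1024 flush requests when the distance equals the chunk size (one per write), 64 with a 1MB distance, and 2 with a 32MB distance.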

A simpler change would have been to just increase the size of the
copy buffer without changing the loop logic, but that seems likely
to trash the processor cache without really helping much.

Back-patch to 9.6 where we introduced msync() as an implementation
option for pg_flush_data(). The problem seems specific to APFS's
mmap/msync support, so I don't think we need to go further back.

Discussion: https://postgr.es/m/CADkxhTNv-j2jw2g8H57deMeAbfRgYBoLmVuXkC=YCFBXRuCOww@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/643c27e36ff38f40d256c2a05b51a14ae2b26077

Modified Files
--------------
src/backend/storage/file/copydir.c | 38 ++++++++++++++++++++++++++++++--------
1 file changed, 30 insertions(+), 8 deletions(-)
