Re: Horrible CREATE DATABASE Performance in High Sierra

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Brent Dearth <brent(dot)dearth(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Horrible CREATE DATABASE Performance in High Sierra
Date: 2017-10-02 18:23:35
Message-ID: 10269.1506968615@sss.pgh.pa.us
Lists: pgsql-hackers

Brent Dearth <brent(dot)dearth(at)gmail(dot)com> writes:
> I just recently "upgraded" to High Sierra and experiencing horrendous CREATE
> DATABASE performance. Creating a database from a 3G template DB used to
> take ~1m but post-upgrade is taking ~22m at a sustained write of around
> 4MB/s. Occasionally, attempting to create an empty database hangs
> indefinitely as well. When this happens, restarting the Postgres server
> allows empty database initialization in ~1s.

What PG version are you running?

I tried to reproduce this, using HEAD and what I had handy:
(a) Late 2016 MacBook Pro, 2.7GHz i7, still on Sierra
(b) Late 2013 MacBook Pro, 2.3GHz i7, High Sierra, drive is converted to APFS

I made a ~7.5GB test database using "pgbench -i -s 500 bench" and
then cloned it with "create database b2 with template bench".

Case 1: fsync off.
Machine A did the clone in 5.6 seconds, machine B in 12.9 seconds.

Considering the CPU speed difference and the fact that Apple put
significantly faster SSDs into the 2016 models, I'm not sure this
difference is due to anything but better hardware.

Case 2: fsync on.
Machine A did the clone in 7.5 seconds, machine B in 2523.5 sec (42 min!).

So something is badly busted in APFS' handling of fsync, and/or
we're doing it in a bad way.

Interestingly, pg_test_fsync shows only about a factor-of-2 difference
in the timings for regular file fsyncs. So I poked into non-fsync
logic that we'd added recently, and after a while found that diking out
the msync code path in pg_flush_data reduces machine B's time to an
entirely reasonable 11.5 seconds.
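
For anyone not familiar with that code path, the general technique (map a file range, then call msync() with MS_ASYNC as a writeback hint) can be sketched roughly as below. This is a minimal illustration, not the PostgreSQL source: the names flush_hint_msync and demo_flush_hint are hypothetical, and the sketch assumes a page-aligned offset and simply rejects anything else.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL's code): hint that the kernel should
 * start writing back the file range [offset, offset + nbytes) by mapping it
 * and calling msync(MS_ASYNC). Returns 0 on success, -1 on failure.
 * This is only a hint; durability still requires fsync().
 */
int
flush_hint_msync(int fd, off_t offset, off_t nbytes)
{
    long  pagesize = sysconf(_SC_PAGESIZE);
    void *p;
    int   rc;

    if (pagesize <= 0 || offset < 0 || nbytes <= 0)
        return -1;

    /* mmap() needs a page-aligned file offset; this sketch rejects others */
    if (offset % pagesize != 0)
        return -1;

    p = mmap(NULL, (size_t) nbytes, PROT_READ, MAP_SHARED, fd, offset);
    if (p == MAP_FAILED)
        return -1;

    /* MS_ASYNC requests writeback of dirty pages without waiting for it */
    rc = msync(p, (size_t) nbytes, MS_ASYNC);

    if (munmap(p, (size_t) nbytes) != 0)
        rc = -1;
    return rc;
}

/*
 * Hypothetical usage: write 32kB to a scratch file, issue the hint, then
 * fsync() for real durability. Removes the file afterwards.
 */
int
demo_flush_hint(const char *path)
{
    char  buf[8192];
    off_t written = 0;
    int   fd;
    int   rc;
    int   i;

    assert(path != NULL);

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    memset(buf, 'x', sizeof(buf));
    for (i = 0; i < 4; i++)
    {
        if (write(fd, buf, sizeof(buf)) != (ssize_t) sizeof(buf))
        {
            close(fd);
            unlink(path);
            return -1;
        }
        written += (off_t) sizeof(buf);
    }

    rc = flush_hint_msync(fd, 0, written);
    if (rc == 0 && fsync(fd) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    unlink(path);
    return rc;
}
```

Calling a hint like this after every chunk of a large copy is the access pattern under suspicion here. The sketch does not reproduce the slowdown by itself; it only shows which system calls are involved.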

In short, therefore, APFS cannot cope with the way we're using msync().
I observe that the copy gets slower and slower as it runs (watching the
transfer rate with "iostat 1"), which makes me wonder if there's some
sort of O(N^2) issue in the kernel logic for this. But anyway, as
a short-term workaround you might try

diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index b0c174284b..af35de5f7d 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -451,7 +451,7 @@ pg_flush_data(int fd, off_t offset, off_t nbytes)
 			return;
 	}
 #endif
-#if !defined(WIN32) && defined(MS_ASYNC)
+#if 0
 	{
 		void	   *p;
 		static int	pagesize = 0;

regards, tom lane
