Re: Horrible CREATE DATABASE Performance in High Sierra

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Brent Dearth <brent(dot)dearth(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Horrible CREATE DATABASE Performance in High Sierra
Date: 2017-10-02 20:23:38
Message-ID: 20171002202338.zcve6naobz3kf4rn@alap3.anarazel.de
Lists: pgsql-hackers

On 2017-10-02 15:59:05 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2017-10-02 15:54:43 -0400, Tom Lane wrote:
> >> Should I expect there to be any difference at all? We don't enable
> >> *_flush_after by default on non-Linux platforms.
>
> > Right, you'd have to enable that. But your patch would neuter an
> > intentionally enabled config too, no?
>
> Well, if you want to suggest a specific scenario to try, I'm happy to.
> I am not going to guess as to what will satisfy you.

To demonstrate what I'm observing here, on Linux with a fairly fast SSD:

with:
-c autovacuum_analyze_threshold=2147483647 # to avoid analyze snapshot issue
-c fsync=on
-c synchronous_commit=on
-c shared_buffers=4GB
-c max_wal_size=30GB
-c checkpoint_timeout=30s
-c checkpoint_flush_after=0
-c bgwriter_flush_after=0
and
pgbench -i -s 100 -q

a pgbench -M prepared -c 8 -j 8 -n -P1 -T 100
often has periods like:

synchronous_commit = on:
progress: 73.0 s, 395.0 tps, lat 20.029 ms stddev 4.001
progress: 74.0 s, 289.0 tps, lat 23.730 ms stddev 23.337
progress: 75.0 s, 88.0 tps, lat 104.029 ms stddev 178.038
progress: 76.0 s, 400.0 tps, lat 20.055 ms stddev 4.844
latency average = 21.599 ms
latency stddev = 13.865 ms
tps = 370.346506 (including connections establishing)
tps = 370.372550 (excluding connections establishing)

with synchronous_commit=off those periods are a lot worse:
progress: 57.0 s, 21104.3 tps, lat 0.379 ms stddev 0.193
progress: 58.0 s, 9994.1 tps, lat 0.536 ms stddev 3.140
progress: 59.0 s, 0.0 tps, lat -nan ms stddev -nan
progress: 60.0 s, 0.0 tps, lat -nan ms stddev -nan
progress: 61.0 s, 0.0 tps, lat -nan ms stddev -nan
progress: 62.0 s, 0.0 tps, lat -nan ms stddev -nan
progress: 63.0 s, 3319.6 tps, lat 12.860 ms stddev 253.664
progress: 64.0 s, 20997.0 tps, lat 0.381 ms stddev 0.190
progress: 65.0 s, 20409.1 tps, lat 0.392 ms stddev 0.303
...
latency average = 0.745 ms
latency stddev = 20.470 ms
tps = 10743.555553 (including connections establishing)
tps = 10743.815591 (excluding connections establishing)
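
The stall periods in the per-second output above can also be picked out mechanically. The following is a minimal Python sketch, not part of pgbench: the names parse_progress and find_stalls and the 0.5 threshold are assumptions made for illustration. It parses the "progress:" lines that -P1 prints, treats "-nan" latencies as NaN, and reports every second whose tps is below a chosen fraction of the median tps of the sample.

```python
import math
import re
import statistics

# Matches one per-second progress line in the format shown above, e.g.
# "progress: 59.0 s, 0.0 tps, lat -nan ms stddev -nan"
PROGRESS_RE = re.compile(
    r"progress:\s+(?P<t>[\d.]+)\s+s,\s+(?P<tps>[\d.]+)\s+tps,"
    r"\s+lat\s+(?P<lat>\S+)\s+ms\s+stddev\s+(?P<sd>\S+)"
)


def parse_progress(lines):
    """Yield (elapsed_s, tps, latency_ms) per progress line; skip other lines."""
    for line in lines:
        m = PROGRESS_RE.search(line)
        if m is None:
            continue
        lat = m.group("lat")
        # Seconds with no completed transactions report "-nan" latency.
        latency = math.nan if "nan" in lat.lower() else float(lat)
        yield float(m.group("t")), float(m.group("tps")), latency


def find_stalls(lines, stall_fraction=0.5):
    """Return elapsed-second marks whose tps < stall_fraction * median tps.

    stall_fraction is an arbitrary illustrative threshold, not a pgbench
    concept; tune it for the workload being examined.
    """
    samples = list(parse_progress(lines))
    if not samples:
        return []
    median_tps = statistics.median(tps for _, tps, _ in samples)
    return [t for t, tps, _ in samples if tps < stall_fraction * median_tps]
```

Fed the synchronous_commit=off excerpt above, this flags the four zero-tps seconds (59 through 62). Note that the median of a short excerpt dominated by a stall is itself low, so on a full run the baseline would be closer to the steady-state rate and more seconds around the stall would be flagged.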

contrasting that to checkpoint_flush_after=256kB and
bgwriter_flush_after=512kB:

synchronous_commit=on
worst:
progress: 87.0 s, 298.0 tps, lat 26.874 ms stddev 26.691

latency average = 21.898 ms
latency stddev = 6.416 ms
tps = 365.308180 (including connections establishing)
tps = 365.318793 (excluding connections establishing)

synchronous_commit=off
worst:
progress: 30.0 s, 7026.8 tps, lat 1.137 ms stddev 11.070

latency average = 0.550 ms
latency stddev = 5.599 ms
tps = 14547.842213 (including connections establishing)
tps = 14548.325102 (excluding connections establishing)

If you do the same on rotational disks, the stall periods can get a
*lot* worse (multi-minute stalls with pretty much no activity).

What I'm basically wondering is whether we're screwing somebody over
who made the effort to manually configure this on OSX. It's fairly
obvious we need to find a way to disable the msync() by default.

Greetings,

Andres Freund
