Re: checkpointer continuous flushing

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: checkpointer continuous flushing
Date: 2015-06-22 08:11:09
Message-ID: alpine.DEB.2.10.1506221008500.23011@sto
Lists: pgsql-hackers


<sorry, resent stalled post, wrong from>

> It'd be interesting to see numbers for tiny, without the overly small
> checkpoint timeout value. 30s is below the OS's writeback time.

Here are some tests with a longer timeout:

tiny2: scale=10 shared_buffers=1GB checkpoint_timeout=5min
max_wal_size=1GB warmup=600 time=4000

flsh |      full speed tps      | percent of late tx, 4 clients, for tps:
/srt |  1 client  |  4 clients  |  100 |  200 |  400 |  800 | 1200 | 1600
N/N  | 930 +- 124 | 2560 +- 394 | 0.70 | 1.03 | 1.27 | 1.56 | 2.02 | 2.38
N/Y  | 924 +- 122 | 2612 +- 326 | 0.63 | 0.79 | 0.94 | 1.15 | 1.45 | 1.67
Y/N  | 907 +- 112 | 2590 +- 315 | 0.58 | 0.83 | 0.68 | 0.71 | 0.81 | 1.26
Y/Y  | 915 +- 114 | 2590 +- 317 | 0.60 | 0.68 | 0.70 | 0.78 | 0.88 | 1.13

There seems to be a small 1-2% performance benefit with 4 clients; this is
reversed for 1 client. There are significantly and consistently fewer late
transactions when the options are activated, and the performance is more
stable (standard deviation reduced by 10-18%).

The db is about 200 MB ~ 25000 pages. At 2500+ tps it is written 40 times
over in 5 minutes, so the checkpoint basically writes everything in 220
seconds, at about 0.9 MB/s. Given the preload phase, the buffers may be more
or less in order in memory, and so may be written out in order anyway.
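
As a rough check of the arithmetic above, here is a minimal sketch. It
assumes the default 8 kB PostgreSQL page size and 1 MB = 1024 kB, neither of
which is stated in this post; the function names are made up for the
illustration.

```python
# Back-of-the-envelope check of the tiny2 checkpoint write rate.
# Assumptions (not stated in the post): 8 kB pages, 1 MB = 1024 kB.
PAGE_KB = 8


def pages_for(db_mb, page_kb=PAGE_KB):
    """Number of pages needed to hold db_mb megabytes."""
    return int(db_mb * 1024 // page_kb)


def write_rate_mb_s(db_mb, seconds):
    """Average rate if the whole database is written once in `seconds`."""
    return db_mb / seconds


db_mb = 200      # approximate database size, from the post
spread_s = 220   # time over which the checkpoint writes, from the post

print(pages_for(db_mb))                             # pages in the database
print(round(write_rate_mb_s(db_mb, spread_s), 2))   # MB/s
```

Under these assumptions this gives 25600 pages and about 0.91 MB/s, in line
with the rounded figures quoted above.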

medium2: scale=300 shared_buffers=5GB checkpoint_timeout=30min
max_wal_size=4GB warmup=1200 time=7500

flsh |       full speed tps      | percent of late tx, 4 clients
/srt |  1 client   |  4 clients  |   100 |   200 |   400
N/N  | 173 +- 289* | 198 +- 531* | 27.61 | 43.92 | 61.16
N/Y  | 458 +- 327* | 743 +- 920* |  7.05 | 14.24 | 24.07
Y/N  | 169 +- 166* | 187 +- 302* |  4.01 | 39.84 | 65.70
Y/Y  | 546 +- 143  | 681 +- 459  |  1.55 |  3.51 |  2.84

The effect of sorting is very positive (+150% to 270% tps). On this run,
flushing has a positive (+20% with 1 client) or negative (-8% with 4
clients) effect on throughput, and late transactions are reduced by 92-95%
when both options are activated.
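
For anyone wanting to recompute these relative changes from the medium2
table, a minimal sketch follows. The numbers are copied from the table; the
helper name pct_change and the choice of which rows to compare are my own
reading, not something taken from the test scripts.

```python
# Relative changes derived from the medium2 table above.
def pct_change(before, after):
    """Percentage change going from `before` to `after`."""
    return (after - before) / before * 100.0


# full speed tps, keyed by flush/sort setting: (1 client, 4 clients)
tps = {"N/N": (173, 198), "N/Y": (458, 743),
       "Y/N": (169, 187), "Y/Y": (546, 681)}
# percent of late tx with 4 clients at 100, 200 and 400 tps
late = {"N/N": (27.61, 43.92, 61.16), "Y/Y": (1.55, 3.51, 2.84)}

# Effect of turning sorting on, with flushing off and with flushing on.
sort_gain = [pct_change(tps[a][i], tps[b][i])
             for a, b in (("N/N", "N/Y"), ("Y/N", "Y/Y"))
             for i in (0, 1)]
# Effect of turning flushing on when sorting is already on.
flush_effect = [pct_change(tps["N/Y"][i], tps["Y/Y"][i]) for i in (0, 1)]
# Reduction in late transactions, both options on versus both off.
late_cut = [-pct_change(n, y) for n, y in zip(late["N/N"], late["Y/Y"])]

print([round(x) for x in sort_gain])
print([round(x) for x in flush_effect])
print([round(x) for x in late_cut])
```

With this reading, flushing comes out at about +19% and -8%, and the late
transaction reduction at 92-95%, matching the rounded figures above; the
sorting gains land close to the quoted range.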

At 550 tps, checkpoints are xlog-triggered and write about 1/3 of the
database (170000 buffers to write every 220-260 seconds, 4 MB/s).

--
Fabien.
