Skip site navigation (1) Skip section navigation (2)

Re: Raid 10 chunksize

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Raid 10 chunksize
Date: 2009-03-25 08:07:50
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
On Wed, 25 Mar 2009, Mark Kirkwood wrote:

> I'm thinking that the raid chunksize may well be the issue.

Why?  I'm not saying you're wrong, I just don't see why that parameter 
jumped out as a likely cause here.

> Sun 4140 2x quad-core opteron 2356 16G RAM,  6x 15K 140G SAS

That server doesn't have any sort of write cache on it, right?  That means 
that all the fsync's done near checkpoint time are going to thrash your 
disks around.  One thing you can do to improve that situation is push 
checkpoint_segments up to the maximum you can possibly stand.  You could 
consider double or even quadruple what you're using right now, the 
recovery time after a crash will spike upwards a bunch though.  That will 
minimize the number of checkpoints and reduce the average disk I/O they 
produce per unit of time, due to how they're spread out in 8.3.  You might 
bump upwards checkpoint_completion_target to 0.9 in order to get some 
improvement without increasing recovery time as badly.

Also, if you want to minimize total I/O, you might drop 
bgwriter_lru_maxpages to 0.  That feature presumes you have some spare I/O 
capacity you use to prioritize lower latency, and it sounds like you 
don't.  You get the lowest total I/O per transaction with the background 
writer turned off.

You happened to catch me on a night where I was running some pgbench tests 
here, so I can give you something similar to compare against.  Quad-core 
system, 8GB of RAM, write-caching controller with 3-disk RAID0 for 
database and 1 disk for WAL; Linux software RAID though.  Here's the same 
data you collected at the same scale you're testing, with similar 
postgresql.conf settings too (same shared_buffers and 
checkpoint_segments, I didn't touch any of the vacuum parameters):

number of clients: 32
number of transactions per client: 6250
number of transactions actually processed: 200000/200000
tps = 1097.933319 (including connections establishing)
tps = 1098.372510 (excluding connections establishing)

Cpu(s):  3.6%us,  1.0%sy,  0.0%ni, 57.2%id, 37.5%wa,  0.0%hi,  0.7%si,  0.0%st
Mem:   8174288k total,  5545396k used,  2628892k free,   473248k buffers
Swap:        0k total,        0k used,        0k free,  4050736k cached

sda,b,d are the database, sdc is the WAL, here's a couple of busy periods:

Device:         rrqm/s   wrqm/s   r/s    w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   337.26  0.00 380.72     0.00     2.83    15.24   104.98  278.77   2.46  93.55
sdb               0.00   343.56  0.00 386.31     0.00     2.86    15.17    91.32  236.61   2.46  94.95
sdd               0.00   342.86  0.00 391.71     0.00     2.92    15.28   128.36  342.42   2.43  95.14
sdc               0.00   808.89  0.00  45.45     0.00     3.35   150.72     1.22   26.75  21.13  96.02

Device:         rrqm/s   wrqm/s   r/s   w/s     rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   377.82  0.00 423.38     0.00     3.13    15.12    74.24  175.21   1.41  59.58
sdb               0.00   371.73  0.00 423.18     0.00     3.13    15.15    50.61  119.81   1.41  59.58
sdd               0.00   372.93  0.00 414.99     0.00     3.06    15.09    60.02  144.32   1.44  59.70
sdc               0.00  3242.16  0.00 258.84     0.00    13.68   108.23     0.88    3.42   2.96  76.60

They don't really look much different from yours.  I'm using software RAID 
and haven't touched any of its parameters; didn't even use noatime on the 
ext3 filesystems (you should though--that's one of those things the write 
cache really helps out with in my case).

* Greg Smith gsmith(at)gregsmith(dot)com Baltimore, MD

In response to


pgsql-performance by date

Next:From: Jerry ChamplinDate: 2009-03-25 10:05:40
Subject: Re: Raid 10 chunksize
Previous:From: Bryan MurphyDate: 2009-03-25 03:47:00
Subject: Re: Help Me Understand Why I'm Getting a Bad Query Plan

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group