Re: H800 + md1200 Performance problem

From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To:
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: H800 + md1200 Performance problem
Date: 2012-04-05 15:49:56
Message-ID: 4F7DBF24.5010402@fuzzy.cz
Lists: pgsql-performance

On 5.4.2012 17:17, Cesar Martin wrote:
> Well, I have installed megacli on server and attach the results in file
> megacli.txt. Also we have "Dell Open Manage" install in server, that can
> generate a log of H800. I attach to mail with name lsi_0403.
>
> About dirty limits, I have default values:
> vm.dirty_background_ratio = 10
> vm.dirty_ratio = 20
>
> I have compared with other server and values are the same, except in
> centos 5.4 database production server that have vm.dirty_ratio = 40

Do the other machines have the same amount of RAM? The point is that
values that work well with less memory don't work as well with large
amounts of memory (and typical RAM sizes have grown a lot recently).

For example, a few years ago a typical server had ~8GB of RAM. In that
case the defaults translate to

vm.dirty_background_ratio = 10 => 800MB
vm.dirty_ratio = 20 => 1600MB

which is all peachy if you have a decent controller with a write cache.
But increase that to 64GB and suddenly the same ratios give

vm.dirty_background_ratio = 10 => 6.4GB
vm.dirty_ratio = 20 => 12.8GB
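
The arithmetic above can be reproduced with a few lines of Python. This is
only a rough sketch: the function name is made up for illustration, and it
applies the ratios to total RAM as this mail does, whereas the kernel applies
them to available (free plus reclaimable) memory, so the real limits are
somewhat lower.

```python
def dirty_thresholds(ram_gb, background_ratio=10, dirty_ratio=20):
    """Approximate dirty-page limits in GB for a given amount of RAM.

    Hypothetical helper: treats the ratios as a percentage of total RAM,
    which slightly overestimates what the kernel actually uses.
    """
    background = ram_gb * background_ratio / 100  # background writeback starts here
    hard_limit = ram_gb * dirty_ratio / 100       # writers get throttled here
    return background, hard_limit


for ram_gb in (8, 64):
    background, hard_limit = dirty_thresholds(ram_gb)
    print(f"{ram_gb}GB RAM: background={background}GB, dirty={hard_limit}GB")
```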

The problem is that there'll be a lot of dirty data waiting in the page
cache (for 30 seconds by default), and then the kernel suddenly starts
writing all of it to the controller. Such systems behave just like your
system: short bursts of writes interleaved with 'no activity'.

Greg Smith wrote a nice howto about this - it's from 2007 but all the
recommendations are still valid:

http://www.westnet.com/~gsmith/content/linux-pdflush.htm

TL;DR:

- decrease vm.dirty_background_ratio/vm.dirty_ratio (or use the *_bytes
  variants, which take absolute sizes instead of percentages)

- consider decreasing vm.dirty_expire_centisecs
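
As a sketch of what the recommendations above could look like in practice,
the Python snippet below prints lines suitable for /etc/sysctl.conf using the
*_bytes variants. The helper name and the sizes (64MB, 512MB, 5 seconds) are
placeholders I picked for illustration, not values tuned for the H800; choose
them with the controller's write cache in mind. Note that the kernel accepts
only one of the ratio/bytes pair at a time: once a *_bytes value is written,
the matching *_ratio reads as 0.

```python
def sysctl_lines(background_mb, dirty_mb, expire_seconds):
    """Build sysctl.conf lines from sizes in MB and an expiry in seconds.

    Hypothetical helper; the sysctl names themselves are the real ones.
    """
    mib = 1024 * 1024
    return [
        f"vm.dirty_background_bytes = {background_mb * mib}",
        f"vm.dirty_bytes = {dirty_mb * mib}",
        # The kernel expects this value in hundredths of a second.
        f"vm.dirty_expire_centisecs = {expire_seconds * 100}",
    ]


# Illustrative values only (assumption, not a recommendation).
for line in sysctl_lines(background_mb=64, dirty_mb=512, expire_seconds=5):
    print(line)
```

The resulting lines can be appended to /etc/sysctl.conf and loaded with
"sysctl -p".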

T.
