Re: H800 + md1200 Performance problem

From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: H800 + md1200 Performance problem
Date: 2012-04-04 19:50:44
Message-ID: 4F7CA614.8060008@fuzzy.cz
Lists: pgsql-performance

On 4.4.2012 20:46, Cesar Martin wrote:
> A RAID controller issue or a driver problem was the first thing I
> looked into.
> I installed CentOS 5.4 at the beginning, but I had performance problems,
> and I contacted Dell support... but CentOS is not supported by Dell...
> Then I installed Red Hat 6 and we contacted Dell with the same problem.
> Dell says that everything is fine and that this is a software problem.
> I have installed CentOS 5.4, 6.2 and Red Hat 6 with similar results, so I
> don't think it's a driver problem (megasas-raid kernel module).
> I will check kernel updates...
> Thanks!

Well, there are different meanings of 'working'. Obviously you mean
'gives reasonable performance' while Dell understands 'is not on fire'.

IIRC the H800 is just a 926x controller from LSI, so it's probably based
on the LSI 2108. Can you post basic info about the settings, e.g.

MegaCli -AdpAllInfo -aALL

or something like that? I'm especially interested in the access/cache
policies, cache flush interval, etc., e.g.

MegaCli -LDGetProp (-Cache | -Access | -Name | -DskCache)
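
To collect all of that in one go, something like the following might
help. It's only a sketch: the binary name (often MegaCli64 or a full path
under /opt/MegaRAID) and the "-LALL -aALL" selectors for all logical
drives on all adapters are my assumptions, so adjust them to your install.

```python
import subprocess

# Assumption: the binary is on PATH under this name; on many systems it is
# MegaCli64 or /opt/MegaRAID/MegaCli/MegaCli64 instead.
MEGACLI = "MegaCli"

# The four properties mentioned above.
PROPERTIES = ["-Cache", "-Access", "-Name", "-DskCache"]


def ldgetprop_commands(binary=MEGACLI):
    """Build one -LDGetProp command per property, for all logical drives."""
    # Assumption: -LALL / -aALL select all logical drives / all adapters.
    return [[binary, "-LDGetProp", prop, "-LALL", "-aALL"]
            for prop in PROPERTIES]


def collect(binary=MEGACLI):
    """Run the adapter summary plus the per-property queries, print output."""
    commands = [[binary, "-AdpAllInfo", "-aALL"]] + ldgetprop_commands(binary)
    for cmd in commands:
        print("$ " + " ".join(cmd))
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(result.stdout or result.stderr)
```

Calling collect() as root and pasting the output into a reply should be
enough to see the policies.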

What I'd do next is test a much smaller array (even a single drive) to
see if the issue still exists. If that works fine, add another drive,
and so on. That makes it much easier to show Dell that something's
wrong. The simpler the test case, the better.
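
A simple test case could look like the sketch below: it writes a file
sequentially, calls fsync after every chunk, and records the throughput
per chunk, so both the average and the fluctuation are visible. The
function name, the default sizes and the file path are just my choices
for illustration, and it's no replacement for dd or a proper benchmark.

```python
import os
import time


def timed_write(path, total_mb=1024, chunk_mb=64):
    """Write total_mb MiB of zeros to path, fsync after every chunk_mb MiB.

    Returns a list with the throughput (MiB/s) measured for each chunk.
    """
    block = b"\0" * (1024 * 1024)  # 1 MiB of zeros
    rates = []
    with open(path, "wb") as f:
        for _ in range(total_mb // chunk_mb):
            start = time.monotonic()
            for _ in range(chunk_mb):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # force the data out of the page cache
            elapsed = max(time.monotonic() - start, 1e-9)
            rates.append(chunk_mb / elapsed)
    return rates
```

For example, rates = timed_write("/mnt/array/testfile", total_mb=4096)
(the path is hypothetical), then compare min(rates) and max(rates) on
the single drive and on the full array.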

I've found this (it's about a 2108-based controller from LSI):

http://www.xbitlabs.com/articles/storage/display/lsi-megaraid-sas9260-8i_3.html#sect0

The paragraphs below the diagram are interesting. Not sure if they
describe the same issue you have, but maybe it's related.

Anyway, it's quite usual for a RAID controller to have about 50% write
performance compared to read performance, usually because the on-board
CPU is the bottleneck. You have ~530 MB/s and ~170 MB/s, i.e. about
32%, so it's not exactly 50% but it's not that far off.

But the fluctuation, that surely is strange. What are the page cache
dirty limits, i.e.

cat /proc/sys/vm/dirty_background_ratio
cat /proc/sys/vm/dirty_ratio

That's probably the #1 cause of such issues that I've seen (on machines
with a lot of RAM).
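
To see why RAM size matters here, this sketch turns the two ratios into
megabytes. It assumes MemTotal from /proc/meminfo as the base, which
overestimates a bit (the kernel applies the ratios to available memory,
not total), and the function names are mine.

```python
def read_int(path):
    """Read a single integer from a /proc file."""
    with open(path) as f:
        return int(f.read().split()[0])


def mem_total_kb(meminfo_path="/proc/meminfo"):
    """Return MemTotal in kB as reported by /proc/meminfo."""
    with open(meminfo_path) as f:
        for line in f:
            if line.startswith("MemTotal:"):
                return int(line.split()[1])
    raise ValueError("MemTotal not found in " + meminfo_path)


def dirty_limit_mb(ratio_percent, mem_kb):
    """Approximate dirty-page limit in MB for a given ratio and RAM size."""
    return mem_kb * ratio_percent // 100 // 1024


def report():
    """Print both limits for the machine this runs on."""
    mem_kb = mem_total_kb()
    for name in ("dirty_background_ratio", "dirty_ratio"):
        ratio = read_int("/proc/sys/vm/" + name)
        print("%s = %d%% -> roughly %d MB of dirty data"
              % (name, ratio, dirty_limit_mb(ratio, mem_kb)))


if __name__ == "__main__":
    try:
        report()
    except OSError as exc:
        print("cannot read /proc:", exc)
```

With 64 GB of RAM and a ratio of 20, dirty_limit_mb(20, 64 * 1024 * 1024)
gives about 13 GB of dirty data that may pile up before writers get
throttled, and at ~170 MB/s that takes well over a minute to flush.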

Tomas
