Re: disk I/O problems and Solutions

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Alan McKay <alan(dot)mckay(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: disk I/O problems and Solutions
Date: 2009-10-09 19:22:57
Message-ID: dcc563d10910091222h16fb0c50q402fc2ceac17c512@mail.gmail.com
Lists: pgsql-performance

On Fri, Oct 9, 2009 at 10:45 AM, Alan McKay <alan(dot)mckay(at)gmail(dot)com> wrote:
> Hey folks,
>
> CentOS / PostgreSQL shop over here.
>
> I'm hitting 3 of my favorite lists with this, so here's hoping that
> the BCC trick is the right way to do it :-)

I added pgsql-performance back in on my reply so we can share with the
rest of the class.

> We've just discovered thanks to a new Munin plugin
> http://blogs.amd.co.at/robe/2008/12/graphing-linux-disk-io-statistics-with-munin.html
> that our production DB is completely maxing out in I/O for about a 3
> hour stretch from 6am til 9am
> This is "device utilization" as per the last graph at the above link.

What do vmstat, sar, or top have to say about it? If you're at 100%
IO wait, then yeah, your disk subsystem is your bottleneck.
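
If you want the same number without eyeballing vmstat, here's a minimal
sketch that computes the iowait share from two samples of the aggregate
"cpu" line in /proc/stat on Linux. The function names are mine, made up
for illustration, and it assumes the usual field order (user, nice,
system, idle, iowait, irq, softirq, steal).

```python
def parse_cpu_line(line):
    """Return the numeric counters from the aggregate 'cpu' line of /proc/stat."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in parts[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Sum only the first eight counters; guest time (if present) is
    # already accounted for inside user/nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    # Index 4 is the iowait counter.
    return 100.0 * deltas[4] / total


def sample(path="/proc/stat"):
    """Read one sample; the first line of /proc/stat is the aggregate cpu line."""
    with open(path) as f:
        return parse_cpu_line(f.readline())
```

Take one sample(), sleep a few seconds, take another, and pass both to
iowait_percent(). Run it across the 6am to 9am window and compare it
with what Munin shows.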

> Our system
> IBM 3650 - quad 2Ghz e5405 Xeon
> 8K SAS RAID Controller

Does this RAID controller have a battery backed cache on it?

> 6 x 300G 15K/RPM SAS Drives
> /dev/sda - 2 drives configured as a RAID 1 for 300G for the OS
> /dev/sdb - 3 drives configured as RAID5 for 600G for the DB
> 1 drive as a global hot spare
>
> /dev/sdb is the one that is maxing out.

Yeah, with RAID-5 that's not surprising. Every small write turns into
a read-modify-write of both data and parity, so if you've got even a
small percentage of writes in the mix, RAID-5 is gonna be pretty slow.
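
To put rough numbers on that, here's a back-of-the-envelope sketch
using the usual rule-of-thumb write penalties: 4 disk I/Os per small
random write on RAID-5, 2 on RAID-10. The per-drive IOPS figure and the
30% write mix are assumptions for illustration, not measurements from
your box.

```python
def effective_iops(drives, iops_per_drive, write_fraction, write_penalty):
    """Rule-of-thumb random IOPS the host sees from an array.

    Each host read costs one disk I/O; each host write costs
    `write_penalty` disk I/Os.
    """
    raw = drives * iops_per_drive
    cost_per_host_io = (1.0 - write_fraction) + write_fraction * write_penalty
    return raw / cost_per_host_io


PER_DRIVE = 180      # assumed random IOPS for one 15K drive (illustrative)
WRITE_FRACTION = 0.3  # assumed share of writes in the workload (illustrative)

raid5 = effective_iops(3, PER_DRIVE, WRITE_FRACTION, write_penalty=4)
raid10 = effective_iops(4, PER_DRIVE, WRITE_FRACTION, write_penalty=2)

print("3-drive RAID-5 : %.0f IOPS" % raid5)
print("4-drive RAID-10: %.0f IOPS" % raid10)
```

It ignores controller cache, which is why the battery backed cache
question matters: a write-back cache hides a lot of that penalty.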

> We need to have a very serious look at fixing this situation.   But we
> don't have the money to be experimenting with solutions that won't
> solve our problem.  And our budget is fairly limited.
>
> Is there a public library somewhere of disk subsystems and their
> performance figures?  Done with some semblance of a standard
> benchmark?

Not that I know of, and if there is, I'm as eager as you to find it.

This mailing list's archives are as close as I've come to finding it.

> One benchmark I am partial to is this one :
> http://wiki.postgresql.org/wiki/PgCon_2009/Greg_Smith_Hardware_Benchmarking_notes#dd_test
>
> One thing I am thinking of in the immediate term is taking the RAID5 +
> hot spare and converting it to RAID10 with the same amount of storage.
>  Will that perform much better?

Almost certainly.
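
The capacity does work out the same, as a quick check shows. This is
just the standard usable-space arithmetic; the helper name is made up
for illustration.

```python
def usable_gb(level, drives, drive_gb):
    """Usable capacity in GB for a few common RAID levels."""
    if level == "raid5":
        # One drive's worth of space goes to parity.
        return (drives - 1) * drive_gb
    if level == "raid10":
        if drives % 2:
            raise ValueError("RAID-10 needs an even number of drives")
        # Half the drives hold mirror copies.
        return (drives // 2) * drive_gb
    raise ValueError("unsupported level: %s" % level)


current = usable_gb("raid5", 3, 300)    # today's /dev/sdb
proposed = usable_gb("raid10", 4, 300)  # RAID-5 drives plus the hot spare

print(current, proposed)
```

Keep in mind that folding the spare into the array leaves you with no
hot spare in the chassis.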

> In general we are planning to move away from RAID5 toward RAID10.
>
> We also have on order an external IBM array (don't have the exact name
> on hand but model number was 3000) with 12 drive bays.  We ordered it
> with just 4 x SATAII drives, and were going to put it on a different
> system as a RAID10.  These are just 7200 RPM drives - the goal was
> cheaper storage because the SAS drives are about twice as much per
> drive, and it is only a 300G drive versus the 1T SATA2 drives.   IIRC
> the SATA2 drives are about $200 each and the SAS 300G drives about
> $500 each.

> So I have 2 thoughts with this 12 disk array.   1 is to fill it up
> with 12 x cheap SATA2 drives and hope that even though the spin-rate
> is a lot slower, that the fact that it has more drives will make it
> perform better.  But somehow I am doubtful about that.   The other
> thought is to bite the bullet and fill it up with 300G SAS drives.

I'd give the SATA drives a try. If they aren't fast enough, then
everybody in the office gets a free / cheap drive upgrade in their
desktop machine. More drives == faster RAID-10, up to the point you
saturate your controller / IO bus on your machine.
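
A crude sketch of that scaling, for the write side only: each mirrored
pair contributes one drive's worth of throughput until the controller
or bus caps it. The 80 MB/s per drive and 300 MB/s bus limit are
placeholders, not specs for your array, so plug in numbers from the dd
test you linked.

```python
def raid10_write_mb_s(drives, per_drive_mb_s, bus_limit_mb_s):
    """Very rough sequential write throughput model for RAID-10."""
    # Every write lands on both halves of a mirror, so only half the
    # drives add to write throughput.
    pairs = drives // 2
    return min(pairs * per_drive_mb_s, bus_limit_mb_s)


PER_DRIVE_MB_S = 80   # assumed per-drive rate (placeholder)
BUS_LIMIT_MB_S = 300  # assumed controller / bus ceiling (placeholder)

for n in (4, 8, 12):
    rate = raid10_write_mb_s(n, PER_DRIVE_MB_S, BUS_LIMIT_MB_S)
    print("%2d drives: %d MB/s" % (n, rate))
```

Once the min() starts returning the bus limit, adding drives buys you
capacity but no more speed.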
