Re: Really bad diskio

From: Ron Wills <ron(at)rwsoft(dot)ca>
To: "Jeffrey W(dot) Baker" <jwbaker(at)acm(dot)org>
Cc: Ron Wills <ron(at)rwsoft(dot)ca>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Really bad diskio
Date: 2005-07-15 22:11:39
Message-ID: 87zmsnafdw.wl%ron@rwsoft.ca
Lists: pgsql-performance

At Fri, 15 Jul 2005 14:53:26 -0700,
Jeffrey W. Baker wrote:
>
> On Fri, 2005-07-15 at 15:29 -0600, Ron Wills wrote:
> > Here's a bit of a dump of the system that should be useful.
> >
> > Processors x2:
> >
> > vendor_id : AuthenticAMD
> > cpu family : 6
> > model : 8
> > model name : AMD Athlon(tm) MP 2400+
> > stepping : 1
> > cpu MHz : 2000.474
> > cache size : 256 KB
> >
> > MemTotal: 903804 kB
> >
> > Mandrake 10.0 Linux kernel 2.6.3-19mdk
> >
> > The raid controller, which is using the hardware raid configuration:
> >
> > 3ware 9000 Storage Controller device driver for Linux v2.26.02.001.
> > scsi0 : 3ware 9000 Storage Controller
> > 3w-9xxx: scsi0: Found a 3ware 9000 Storage Controller at 0xe8020000, IRQ: 17.
> > 3w-9xxx: scsi0: Firmware FE9X 2.02.00.011, BIOS BE9X 2.02.01.037, Ports: 4.
> > Vendor: 3ware Model: Logical Disk 00 Rev: 1.00
> > Type: Direct-Access ANSI SCSI revision: 00
> > SCSI device sda: 624955392 512-byte hdwr sectors (319977 MB)
> > SCSI device sda: drive cache: write back, no read (daft)
> >
> > This is also on a 3.6 reiser filesystem.
> >
> > Here's the iostat for 10mins every 10secs. I've removed the stats from
> > the idle drives to reduce the size of this email.
> >
> > Linux 2.6.3-19mdksmp (photo_server) 07/15/2005
> >
> > avg-cpu: %user %nice %sys %iowait %idle
> > 2.85 1.53 2.15 39.52 53.95
> >
> > Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
> > sda 82.49 4501.73 188.38 1818836580 76110154
> >
> > avg-cpu: %user %nice %sys %iowait %idle
> > 0.30 0.00 1.00 96.30 2.40
> >
> > Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
> > sda 87.80 6159.20 340.00 61592 3400
>
> These I/O numbers are not so horrible, really. 100% iowait is not
> necessarily a symptom of misconfiguration. It just means you are disk
> limited. With a database 20 times larger than main memory, this is no
> surprise.
>
> If I had to speculate about the best way to improve your performance, I
> would say:
>
> 1a) Get a better RAID controller. The 3ware hardware RAID5 is very bad.
> 1b) Get more disks.
> 2) Get a (much) newer kernel.
> 3) Try XFS or JFS. Reiser3 has never looked good in my pgbench runs

Not good news :(. I can't change the hardware, so hopefully a kernel
update and XFS or JFS will make an improvement. I was hoping for
software RAID (it has always worked well), but the client didn't feel
comfortable with it :P.

> By the way, are you experiencing bad application performance, or are you
> just unhappy with the iostat figures?

It's affecting the whole system. It is sending the load averages
through the roof (from 4 to 12), and processes that would normally take
only a few minutes start taking over an hour until it clears up. Well, I
guess I'll have to drum up some more programming magic... and I'm
starting to run out of tricks... I love my job some days :$

> Regards,
> jwb
>
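
To put the iostat figures quoted above in perspective, here is a rough
Python sketch that converts them to throughput and average request
size. It assumes iostat's "blocks" are 512-byte sectors, which should
hold for 2.4 and later kernels but is an assumption here. The
summarize() helper and its names are made up for illustration and are
not part of iostat or any tool mentioned in this thread.

```python
SECTOR_BYTES = 512  # assumption: one iostat "block" is a 512-byte sector


def summarize(tps, blk_read_s, blk_wrtn_s):
    """Return (read MiB/s, write MiB/s, average KiB per transfer)."""
    read_mib = blk_read_s * SECTOR_BYTES / 2**20
    write_mib = blk_wrtn_s * SECTOR_BYTES / 2**20
    # Average request size: total blocks moved per second / transfers per second.
    avg_kib = (blk_read_s + blk_wrtn_s) / tps * SECTOR_BYTES / 1024
    return read_mib, write_mib, avg_kib


# The two sda rows from the iostat output quoted in this message.
samples = {
    "since boot": (82.49, 4501.73, 188.38),
    "10s sample": (87.80, 6159.20, 340.00),
}

for label, (tps, blk_r, blk_w) in samples.items():
    r, w, k = summarize(tps, blk_r, blk_w)
    print(f"{label}: read {r:.2f} MiB/s, write {w:.2f} MiB/s, "
          f"avg {k:.1f} KiB/transfer")
```

Under that assumption the 10-second sample works out to roughly 3 MiB/s
of reads at about 37 KiB per transfer, which is consistent with a
disk-limited system doing many small reads rather than large sequential
ones.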
