Re: extremely slow disk access (using SCSI, RAID)

From: Curt Sampson <cjs(at)cynic(dot)net>
To: Peter Galantha <argosz(at)fw(dot)hu>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: extremely slow disk access (using SCSI, RAID)
Date: 2002-06-26 15:00:56
Message-ID: Pine.NEB.4.43.0206262353190.1069-100000@angelic.cynic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 26 Jun 2002, Peter Galantha wrote:

> we are running 7.1.3 on a linux machine. By running simple 'group by'
> queries on large tables we are we are experiencing __extremely slow__
> disk read rates.
> The size of the database is cca. 40GB+. Size of a table cca. 1GB
> ...
> Storage - RAID 5, 3 x IBM SCSI 36GB (10,000rpm)
> Raid controller - Adaptec 3210
>
> select field, count(*) from table where date between '2001-1-1' and
> '2002-1-1' group by field order by count;

So are the rows selected by this fairly randomly distributed
throughout the table? How many rows are selected?

> procs memory swap io system
> 2 0 0 24744 1043928 47680 840760 0 0 1744 0 322 425
...

Well, if it's doing completely random reads, 1744 reads per second
from only three disks is pretty darn good, I'd say.

> Sometimes the io-bi is cca. 3-5000/sec, but it's still way too low in
> my mind.
> What is the best rate we can achieve with this RAID 5 configuration
> and how?

It depends on whether you're doing sequential or random reads.
Typically you can expect to get 75-150 random reads per second from
a single disk, maybe even up to 300 if it's a super-fast modern
one. (I get about 120/sec from a good 7200 RPM IDE drive when
reading from a 25 GB or so range on it.)

It may seem funny, because you're seeing only a few MB/sec come
from a disk that can do tends of MB/sec when reading sequentially,
but you're spending most of your time seeking, not reading.

So the solution is to add more disk arms.

If you want to test to see if this is really the problem, BTW, use
the CLUSTER command to cluster the table on the date column, and
try the query again. If it suddenly gets way, way faster, it's
random reads that are killing you.

BTW, there's a little benchmark to do quickie tests of random read
performance at http://randread.sourceforge.net .

> As a comparison I quoted the output of vmstat when we read a simple
> file from the disk. (Still not very fast, but faster)
>
> argosz(at)it:~$ cat /var/log/apache/default-access-log.log > /dev/null
>
> procs memory swap io system
> r b w swpd free buff cache si so bi bo in cs
> 1 0 0 24636 181732 49324 1650952 0 0 28700 0 648 1077
...

Yeah, well, sequential reads, eh?

Oh, and normally RAID-5 shoudn't be too much slower when reading.
Writing is another matter...

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Martijn van Oosterhout 2002-06-26 15:05:34 Re: Still problems with memory swapping and server load
Previous Message Tom Lane 2002-06-26 14:59:06 Re: Still problems with memory swapping and server load