Re: 10K vs 15k rpm for analytics

From: Scott Carey <scott(at)richrelevance(dot)com>
To: "<david(at)lang(dot)hm>" <david(at)lang(dot)hm>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, Francisco Reyes <lists(at)stringsutils(dot)com>, Pgsql performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: 10K vs 15k rpm for analytics
Date: 2010-03-09 00:01:21
Message-ID: 06A22EB5-2D84-401C-B75F-7CA883695E0A@richrelevance.com
Lists: pgsql-performance


On Mar 2, 2010, at 2:10 PM, <david(at)lang(dot)hm> wrote:

> On Tue, 2 Mar 2010, Scott Marlowe wrote:
>
>> On Tue, Mar 2, 2010 at 2:30 PM, Francisco Reyes <lists(at)stringsutils(dot)com> wrote:
>>> Scott Marlowe writes:
>>>
>>>> Then the real thing to compare is the speed of the drives for
>>>> throughput not rpm.
>>>
>>> In a machine, similar to what I plan to buy, already in house, 24 x 10K rpm
>>> drives give me about 400MB/sec, while 16 x 15K rpm drives (2 to 3 years old)
>>> give me about 500MB/sec.
>>
>> Have you tried short stroking the drives to see how they compare then?
>> Or is the reduced primary storage not a valid path here?
>>
>> While 16x15k older drives doing 500Meg seems only a little slow, the
>> 24x10k drives getting only 400MB/s seems way slow. I'd expect a
>> RAID-10 of those to read at somewhere in or just past the gig per
>> second range with a fast pcie (x8 or x16 or so) controller. You may
>> find that a faster controller with only 8 or so fast and large SATA
>> drives equals the 24 10k drives you're looking at now. I can write at
>> about 300 to 350 Megs a second on a slower Areca 12xx series
>> controller and 8 2TB Western Digital Green drives, which aren't even
>> made for speed.
>
> What filesystem is being used? There is a thread on the linux-kernel
> mailing list right now showing that ext4 seems to top out at ~360MB/sec
> while XFS is able to go to 500MB/sec+.

I have CentOS 5.4 with 10 7200RPM 1TB SAS drives (Seagate ES.2, same performance as the SATA version) in RAID 10 on an Adaptec 5805, with XFS, and get ~750MB/sec sequential throughput for both reads and writes.
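For reference, a rough way to check raw sequential throughput is a large dd read (a sketch, not from this thread; the sizes are scaled down here, and a real run needs a file much larger than RAM so the page cache doesn't serve the reads):

```shell
# Create a test file, forcing it to disk, then time a sequential read of it.
dd if=/dev/zero of=/tmp/seqtest bs=1M count=64 conv=fdatasync 2>/dev/null
dd if=/tmp/seqtest of=/dev/null bs=1M 2>&1 | tail -n 1   # prints bytes and MB/s
rm -f /tmp/seqtest
```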

A RAID 0 across two of these arrays tops out around 1000MB/sec because postgres becomes CPU bound -- that is for a select count(*). A select * piped to /dev/null is CPU bound below 300MB/sec, converting the data to text.
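The two cases above, as command sketches (the database and table names are placeholders, not from this thread):

```shell
# CPU bound in the backend scanning rows, with almost no output formatting:
#   psql mydb -c "SELECT count(*) FROM big_table;"
# CPU bound converting every row to text on the way out:
#   psql mydb -c "SELECT * FROM big_table;" > /dev/null
```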

For XFS, set readahead to 16MB or so (2MB or so per stripe; blockdev --setra 32768 is 16MB) and absolutely make sure the XFS mount parameter 'allocsize' is set to about the same size or more. For large sequential operations, you want to make sure interleaved writes don't interleave files on disk. I use an 80MB allocsize and 40MB readahead for the reporting data.
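Spelled out as commands (a sketch -- the device and mount point are placeholders; blockdev --setra takes 512-byte sectors, which is where the 32768 comes from):

```shell
# A 16MB readahead window expressed in 512-byte sectors:
readahead_mb=16
sectors=$((readahead_mb * 1024 * 1024 / 512))
echo "$sectors"   # 32768

# The actual tuning commands (placeholders, do not run blindly):
#   blockdev --setra $sectors /dev/sdX
#   mount -t xfs -o allocsize=16m /dev/sdX /data
```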

Later Linux kernels have a significantly improved readahead system that doesn't need as much manual tuning. For high sequential throughput, nothing on Linux is as optimized as XFS yet; it has weaknesses elsewhere, however.

And 3Ware on Linux + high sequential throughput = slow. PERC 6 was 20% faster and Adaptec was 70% faster with the same drives, after experimenting with filesystem and readahead settings for each. From what I hear, Areca is a significant notch above Adaptec there too.

>
> on single disks the disk performance limits you, but on arrays where the
> disk performance is higher there may be other limits you are running into.
>
> David Lang
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
