Re: With 4 disks should I go for RAID 5 or RAID 10

From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Shane Ambler <pgsql(at)Sheeky(dot)Biz>
Cc: Fernando Hevia <fhevia(at)ip-tel(dot)com(dot)ar>, 'pgsql-performance' <pgsql-performance(at)postgresql(dot)org>
Subject: Re: With 4 disks should I go for RAID 5 or RAID 10
Date: 2007-12-27 04:36:27
Message-ID: 47732BCB.2090302@mark.mielke.cc
Lists: pgsql-performance
Shane Ambler wrote:
> So in theory a modern RAID 1 setup can be configured to get similar 
> read speeds as RAID 0 but would still drop to single disk speeds (or 
> similar) when writing, but RAID 0 can get the faster write performance.

Unfortunately, it's a bit more complicated than that. RAID 1 has a 
sequential read problem, as read-ahead is wasted, and you may as well 
read from one disk and ignore the others. RAID 1 does, however, allow 
for much greater concurrency. 4 processes on a 4 disk RAID 1 system can, 
theoretically, each do whatever they want, without impacting each other. 
Database loads involving a single active read user will see greater 
performance with RAID 0. Database loads involving multiple concurrent 
active read users will see greater performance with RAID 1. All of these 
assume writes are not being performed to any significant degree. Even 
a single write causes all disks in a RAID 1 system to synchronize, 
temporarily eliminating the read benefit. RAID 0 allows some degree of 
concurrent reads and writes (assuming even distribution of the data 
across the devices). Of course, RAID 0 systems have an expected life 
that decreases as the number of disks in the system increases.
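
To put rough numbers on the expected-life point, here is a minimal sketch. It assumes disks fail independently and that each has the same failure probability over some period; the 5% figure is made up purely for illustration, and the function names are hypothetical.

```python
def raid0_survival(p_fail: float, n_disks: int) -> float:
    """Chance a RAID 0 stripe set survives: every disk must survive."""
    return (1.0 - p_fail) ** n_disks


def raid1_survival(p_fail: float, n_disks: int) -> float:
    """Chance an n-way RAID 1 mirror survives: at least one disk must survive."""
    return 1.0 - p_fail ** n_disks


if __name__ == "__main__":
    # Assumed per-disk failure probability over some period (illustrative only).
    P_FAIL = 0.05
    for n in (1, 2, 4):
        print(f"{n} disk(s): RAID 0 {raid0_survival(P_FAIL, n):.4f}, "
              f"RAID 1 {raid1_survival(P_FAIL, n):.6f}")
```

Under these assumptions, each disk added to a RAID 0 set lowers the chance that the whole set survives, while each disk added to a mirror raises it.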

So, this is where we get to RAID 1+0. Redundancy, good read performance, 
good write performance, relatively simple implementation. For the mere 
cost of double the number of disks, you can get around the problems of 
RAID 1 and the problems of RAID 0. :-)

> So in a perfect setup (probably 1+0) 4x 300MB/s SATA drives could 
> deliver 1200MB/s of data to RAM, which is also assuming that all 4 
> channels have their own data path to RAM and aren't sharing.
> (anyone know how segregated the on board controllers such as these are?)
> (do some pci controllers offer better throughput?)
> We all know that doesn't happen in the real world ;-) Let's say we are 
> restricted to 80% - 1000MB/s - and some of that (10%) gets used by the 
> system - so we end up with 900MB/s delivered off disk to postgres - 
> that would still be more than the perfect rate at which 2x 300MB/s 
> drives can deliver.

I expect you would have to have good hardware, and a well-tuned system, 
to see 80%+ of theoretical for common work loads. But then, this isn't 
unique to RAID. Even in a single disk system, one has trouble achieving 
80%+ of theoretical. :-)
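
The arithmetic in the quoted estimate can be worked through as a small sketch. The 300MB/s per-disk figure, the 80% efficiency and the 10% system overhead all come from the quoted message; the function name is hypothetical. Note that the quoted message rounds its intermediate results up (80% of 1200MB/s is 960MB/s, not 1000MB/s).

```python
def delivered_throughput(n_disks: int, per_disk_mb_s: float,
                         efficiency: float, system_overhead: float) -> float:
    """Estimate MB/s delivered to the database from a striped set of disks."""
    theoretical = n_disks * per_disk_mb_s          # perfect scaling across all disks
    achievable = theoretical * efficiency          # real-world fraction of theoretical
    return achievable * (1.0 - system_overhead)    # remainder after the system's share


if __name__ == "__main__":
    four_disk = delivered_throughput(4, 300.0, 0.80, 0.10)
    two_disk_perfect = 2 * 300.0  # the "perfect rate" of two drives, per the quote
    print(f"4 disks at 80% minus 10%: {four_disk:.0f} MB/s")
    print(f"2 disks at 100%:          {two_disk_perfect:.0f} MB/s")
```

Even without the rounding, the four disk estimate stays above the perfect two disk rate, so the quoted conclusion holds under its own assumptions.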

I achieve something closer to +20% to +60% over the theoretical 
performance of a single disk with my four disk RAID 1+0 partitions. There 
are lots of compromises in my system, though, that I won't get into. I 
value the redundancy, which allows a single disk to fail and gives me 
time to recover easily. For the cost of two more disks, I am able to 
counter the performance cost of redundancy, and actually see a positive 
performance effect instead.

> So in this situation - if configured correctly with a good controller 
> (driver for software RAID etc) a single 4 disk RAID 1+0 could 
> outperform two 2 disk RAID 1 setups with data/OS+WAL split between the 
> two.
> Is the real world speeds so different that this theory is real fantasy 
> or has hardware reached a point performance wise where this is close 
> to fact??

I think it depends on the balance. If every second operation requires a 
WAL write, having separate arrays might make sense. However, if the balance is 
less than even, one would end up with one of the 2 disk RAID 1 setups 
being more idle than the other. It's not an exact science when it comes 
to the various compromises being made. :-)
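
The balance argument can be illustrated with a toy model. This is only a sketch under strong assumptions: each 2 disk RAID 1 array can service the same number of operations per second, WAL writes go only to one array and everything else only to the other, and seek contention and sequential versus random access are ignored. The function names and the capacity figure are hypothetical.

```python
def split_capacity(wal_fraction: float, array_capacity: float) -> float:
    """Total ops/s two separate arrays sustain before the busier one saturates."""
    busiest_share = max(wal_fraction, 1.0 - wal_fraction)  # always >= 0.5
    return array_capacity / busiest_share


def idle_fraction(wal_fraction: float) -> float:
    """Share of the quieter array left unused when the busier one is saturated."""
    low = min(wal_fraction, 1.0 - wal_fraction)
    high = max(wal_fraction, 1.0 - wal_fraction)
    return 1.0 - low / high


if __name__ == "__main__":
    CAPACITY = 100.0  # assumed ops/s per 2 disk array (illustrative only)
    for wal in (0.5, 0.3, 0.1):
        print(f"WAL share {wal:.0%}: split setup sustains "
              f"{split_capacity(wal, CAPACITY):.0f} ops/s, quieter array "
              f"{idle_fraction(wal):.0%} idle")
```

In this model the split setup only reaches the combined capacity of both arrays when the load is evenly divided; the further the balance tips, the more of one array sits idle.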

If you can only put 4 disks in the system (whether because of cost or 
the system size), I would suggest RAID 1+0 on all four as a sensible 
compromise. If you can put more in, start to consider breaking it up. :-)

Cheers,
mark

-- 
Mark Mielke <mark(at)mielke(dot)cc>