Re: Which RAID Controllers to pick/avoid?

From: david(at)lang(dot)hm
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: Royce Ausburn <royce(dot)ml(at)inomial(dot)com>, Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>, Dan Birken <birken(at)gmail(dot)com>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Which RAID Controllers to pick/avoid?
Date: 2011-02-06 12:15:39
Message-ID: alpine.DEB.2.00.1102060412580.8162@asgard.lang.hm
Lists: pgsql-performance

On Sun, 6 Feb 2011, Scott Marlowe wrote:

> On Sun, Feb 6, 2011 at 2:39 AM, Royce Ausburn <royce(dot)ml(at)inomial(dot)com> wrote:
>>
>>> On Wed, Feb 2, 2011 at 7:00 PM, Craig Ringer
>>> <craig(at)postnewspapers(dot)com(dot)au> wrote:
>>>> Whatever RAID controller you get, make sure you have a battery backup
>>>> unit (BBU) installed so you can safely enable write-back caching.
>>>> Without that, you might as well use software RAID - it'll generally be
>>>> faster (and cheaper) than HW RAID w/o a BBU.
>>>
>>> Recently we had to pull our RAID controllers and go to plain SAS
>>> cards.  While random access dropped a bit, sequential throughput
>>> skyrocketed, saturating the 4 lane cable we use.  4x300MB/s =
>>> 1200MB/s or right around 1G of data a second off the array.  VERY
>>> impressive.
>>
>>
>> This is really surprising.  Software raid generally outperforms hardware
>> raid without BBU?  Why is that?  My company uses hardware raid quite a
>> bit without BBU and has never thought to compare with software raid =/
>
> For raw throughput it's not uncommon to beat a RAID card whether it
> has a battery backed cache or not. If I'm writing a 200G file to the
> disks, a BBU cache isn't gonna make that any faster, it'll fill up in
> a second and then it's got to write to disk. BBU caches are for faster
> random writes, and will handily beat SW RAID. But for raw large file
> read and write SW RAID is the fastest thing I've seen.
>

keep in mind that hardware raid with BBU is safer than software raid.

since the updates to the drives do not all happen at the same time, there
is a chance that a write to software raid may have happened on some drives
and not others when the system crashes.

with hardware raid and BBU, the controller knows what it was trying to
write where, and if it didn't get the acknowledgement, it will complete
the write when it comes up again.

but with software raid you will have updated some parts of the array and
not others. this will result in a corrupted stripe in the array.
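
to make the corrupted stripe concrete, here is a minimal sketch. it assumes
a RAID 5 style layout with XOR parity, three data drives and one parity
drive; the block contents and the helper names (xor_blocks,
stripe_is_consistent) are made up for illustration and are not taken from
any real RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_is_consistent(data_blocks, parity):
    """A stripe is consistent when parity equals the XOR of its data blocks."""
    return xor_blocks(data_blocks) == parity


# Hypothetical stripe: three data drives plus one parity drive.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulated crash: the new block reaches drive 0, but the matching
# parity update never makes it to the parity drive.
data[0] = b"ZZZZ"

consistent = stripe_is_consistent(data, parity)

# If drive 1 fails later, rebuilding it from the stale parity does not
# give back what drive 1 actually held.
rebuilt_drive1 = xor_blocks([data[0], data[2], parity])

print("stripe consistent after crash:", consistent)
print("rebuilt drive 1 matches original:", rebuilt_drive1 == b"BBBB")
```

the point of the sketch is that nothing looks wrong until a drive fails:
the stale parity then produces bad data for a block that was never part
of the interrupted write.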

David Lang

From: Ray Stell <stellr(at)cns(dot)vt(dot)edu>
To: felix <crucialfelix(at)gmail(dot)com>
Cc: sthomas(at)peak6(dot)com, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Really really slow select count(*)
Date: Sun, 6 Feb 2011 08:04:30 -0500
Message-ID: 20110206130430(dot)GC24627(at)cns(dot)vt(dot)edu

On Sun, Feb 06, 2011 at 11:48:50AM +0100, felix wrote:
> BRUTAL
>

Did the changes work in your test environment?
