Re: best use of an EMC SAN

From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: best use of an EMC SAN
Date: 2007-07-11 17:39:39
Message-ID: 60zm22x2jo.fsf@dba2.int.libertyrms.com
Lists: pgsql-performance

pg(at)fastcrypt(dot)com (Dave Cramer) writes:
> On 11-Jul-07, at 10:05 AM, Gregory Stark wrote:
>
>> "Dave Cramer" <pg(at)fastcrypt(dot)com> writes:
>>
>>> Assuming we have 24 73G drives is it better to make one big
>>> metalun and carve
>>> it up and let the SAN manage the where everything is, or is it
>>> better to
>>> specify which spindles are where.
>>
>> This is quite a controversial question with proponents of both
>> strategies.
>>
>> I would suggest having one RAID-1 array for the WAL and throw the
>> rest of the
>
> This is quite unexpected. Since the WAL is primarily all writes,
> isn't a RAID 1 the slowest of all for writing ?

The thing is, the disk array caches this LIKE CRAZY. I'm not quite
sure how many batteries are in there to back things up; there seem to
be multiple levels of them, which means that as far as fsync() is
concerned, the data is committed very quickly even if it takes a while
to physically hit disk.
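
A rough way to see what fsync() costs on a given volume is to time a
write followed by fsync() in a loop. The sketch below is only an
illustration, not a proper benchmark: the function name, the 8192-byte
block (chosen to match PostgreSQL's default block size), and the
sample count are all assumptions. On an array with battery-backed
write cache, one would expect the latencies to be far lower than on a
bare disk.

```python
import os
import tempfile
import time


def fsync_latencies(directory, writes=50, block_size=8192):
    """Return per-write latency in seconds for write() followed by fsync().

    `directory` should live on the volume being examined. The defaults
    are illustrative assumptions, not tuned values.
    """
    block = b"\0" * block_size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, block)
            # fsync() returns once the storage stack reports the data as
            # durable; with a battery-backed cache that can happen long
            # before the data physically reaches the platters.
            os.fsync(fd)
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


if __name__ == "__main__":
    samples = sorted(fsync_latencies(tempfile.gettempdir()))
    median_ms = samples[len(samples) // 2] * 1000
    print("median write+fsync latency: %.3f ms" % median_ms)
```

Pointing `directory` at a filesystem on the candidate WAL volume, and
then at one on the general data volume, gives a crude comparison of
how quickly each acknowledges a commit.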

One piece of the controversy is that the disks used for WAL are
certain to be written to as heavily and continuously as your load
demands. A consequence is that those disks are likely to be worked
harder than the disks used for storing "plain old data," with the
result that if you devote disks to WAL, you'll likely burn through
replacement drives faster there than you do for the "POD" disks.

It is not certain whether it is more desirable to:
a) Spread that wear and tear across the whole array, or
b) Target certain disks for that wear and tear, and expect to need to
replace them somewhat more frequently.

At some point, I'd like to do a test on a decent disk array where we
compare multiple configurations. Assuming 24 drives:

- Use all 24 to make "one big filesystem" as the base case
- Split off a set (6?) for WAL
- Split off a set (6? 9?) to have a second tablespace, and shift
indices there
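
To make the comparison concrete, the layouts above could be written
down as a small test matrix. This is just a sketch: the function
names are hypothetical, the drive counts are the tentative ones from
the list (6 for WAL, 6 or 9 for indices), and treating each split as
an independent variant against the base case, rather than stacking
them, is an assumption.

```python
TOTAL_DRIVES = 24


def layout(wal=0, index=0, total=TOTAL_DRIVES):
    """Describe how many drives go to data, WAL, and an index tablespace."""
    data = total - wal - index
    if wal < 0 or index < 0 or data <= 0:
        raise ValueError("layout must leave at least one drive for data")
    return {"data": data, "wal": wal, "index": index}


def candidate_layouts():
    """Return (label, layout) pairs for the configurations to compare."""
    layouts = [("one big filesystem (base case)", layout())]
    layouts.append(("6 drives split off for WAL", layout(wal=6)))
    for n in (6, 9):
        label = "%d drives split off for an index tablespace" % n
        layouts.append((label, layout(index=n)))
    return layouts


if __name__ == "__main__":
    for label, drives in candidate_layouts():
        print("%-45s %s" % (label, drives))
```

Each entry would then be run against the same workload so that the
only thing varying between runs is the drive allocation.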

My suspicion is that the "use all 24 for one big filesystem" scenario
is likely to be fastest by some small margin, and that the other cases
will lose a little bit in comparison. Andrew Sullivan had a somewhat
similar finding a few years ago on some old Solaris hardware that
unfortunately isn't at all relevant today. He basically found that
moving WAL off to a separate disk didn't affect performance
materially.

What's quite regrettable is that it is almost sure to be difficult to
construct a test that, on a well-appointed modern disk array, won't
basically stay in cache.
--
let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/nonrdbms.html
16-inch Rotary Debugger: A highly effective tool for locating problems
in computer software. Available for delivery in most major
metropolitan areas. Anchovies contribute to poor coding style.
