Re: Anyone using a SAN?

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Tobias Brox <tobias(at)nordicbet(dot)com>
Cc: Peter Koczan <pjkoczan(at)gmail(dot)com>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Anyone using a SAN?
Date: 2008-02-13 23:02:17
Message-ID: Pine.GSO.4.64.0802131740410.24079@westnet.com
Lists: pgsql-performance
On Wed, 13 Feb 2008, Tobias Brox wrote:

> What I'm told is that the state-of-the-art SAN allows for an "insane 
> amount" of hard disks to be installed, much more than what would fit 
> into any decent database server.

You can attach a surprisingly large number of drives to a server nowadays,
but in general it's easier to manage larger numbers of them on a SAN.
Also, there are significant redundancy improvements using a SAN that are
worth quite a bit in some enterprise environments.  Connecting all the
drives, no matter how many, to two or more machines at once is typically
easier to set up on a SAN than with direct-attached storage.

Basically the performance breaks down like this:

1) Going through the SAN interface (Fibre Channel etc.) introduces some 
latency and a potential write bottleneck compared with direct storage, 
everything else being equal.  This can really be a problem if you've got a 
poor SAN vendor or interface issues you can't sort out.

2) It can be easier to manage a large number of disks in the SAN, so for 
situations where aggregate disk throughput is the limiting factor the SAN 
solution might make sense.

3) At the high end, you can get SANs with more cache than any direct
controller I'm aware of, which for some applications can give them a more
clearly measurable lead over direct storage.  It's easy (albeit expensive)
to get an EMC array with 16GB worth of memory for caching on it, for
example (and with 480 drives).  And since they've got a more robust power
setup than a typical server, you can even enable all the individual drive
caches usefully (that's 16-32MB each nowadays, so at say 100 disks you've
potentially got another 1.6GB of cache right there).  If you've got a
typical server you can end up needing to turn off the write caches on
individual direct-attached drives, because their contents may not survive
a power cycle even with a UPS, and you have to rely on just the controller
write cache.
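
To make the cache arithmetic in item 3 concrete, here is a rough sketch in
Python.  The function name and the decimal MB-to-GB conversion are my own
assumptions for illustration; the drive counts and per-drive cache sizes are
the ones quoted above.

```python
def aggregate_drive_cache_gb(num_drives, cache_mb_per_drive):
    """Total on-drive cache across an array, in decimal GB (1 GB = 1000 MB).

    This only adds up per-drive caches; it says nothing about whether
    it is safe to enable them for writes on a given power setup.
    """
    return num_drives * cache_mb_per_drive / 1000.0


# 100 drives at the low and high ends of the 16-32MB range quoted above.
low = aggregate_drive_cache_gb(100, 16)
high = aggregate_drive_cache_gb(100, 32)
print(f"100 drives: {low:.1f}GB to {high:.1f}GB of drive cache")

# The 480-drive array mentioned above, assuming 16MB per drive.
print(f"480 drives: {aggregate_drive_cache_gb(480, 16):.2f}GB")
```

With 16MB drives that matches the 1.6GB figure for 100 disks, and it roughly
doubles at 32MB per drive.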

There's no universal advantage on either side here, just a different set
of trade-offs.  Certainly you'll never come close to the performance/$
that direct storage gets you if you buy SAN storage instead, but at higher
budgets or feature requirements a SAN may make sense anyway.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
