Re: SAN performance mystery

From: "John Vincent" <pgsql-performance(at)lusis(dot)org>
To: "Tim Allen" <tim(at)proximity(dot)com(dot)au>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: SAN performance mystery
Date: 2006-06-15 22:15:38
Message-ID: c841561b0606151515l52a630c9l9ca3373e2274bdba@mail.gmail.com
Lists: pgsql-performance

On 6/15/06, Tim Allen <tim(at)proximity(dot)com(dot)au> wrote:
>
> <snipped>
> Is that expected performance, anyone? It doesn't sound right to me. Does
> anyone have any clues about what might be going on? Buggy kernel
> drivers? Buggy kernel, come to think of it? Does a SAN just not provide
> adequate performance for a large database?
>
> I'd be grateful for any clues anyone can offer,
>
> Tim

Tim,

Here are the areas I would look at first if we're considering hardware to be
the problem:

HBA and driver:
Since this is an Intel/Linux system, the HBA is PROBABLY a QLogic. I would
need to know the SAN model to see what the backend of the SAN itself is. EMC
has some FC-attach models that actually have SATA disks underneath. You also
might want to look at the cache size of the controllers on the SAN.
- Something else to note is that EMC provides an add-on called PowerPath
for load balancing across multiple HBAs. If they don't have this, it might
be worth investigating.
- As with anything, disk layout is important. With the lower-end IBM SAN
(DS4000) you actually have to operate at the physical spindle level. On our
4300, when I create a LUN, I select the exact disks I want and which of the
two controllers is the preferred path. On our DS6800, I just ask for
storage. I THINK all the EMC models are the "ask for storage" type of
scenario. However, with the 6800 you select your storage across extent
pools.

Have they done any benchmarking of the SAN outside of postgres? Before we
settle on a new LUN configuration, we always do the dd, umount, mount, dd
routine: write a large file with dd, remount the filesystem so the data is
no longer cached, then read it back with dd. It's not a perfect test for
databases but it will help you catch GROSS performance issues.
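
If remounting a scratch filesystem isn't convenient, the same write-then-read
check can be roughed out in Python. This is only a sketch under my own
assumptions: the function names, the 1 MiB block size and the mount point are
made up for illustration, and the posix_fadvise call is merely a hint to the
kernel to drop cached pages. It is not a substitute for a real remount, so
treat the read figure with suspicion.

```python
import os
import time

BLOCK = 1024 * 1024  # 1 MiB per I/O, an arbitrary choice


def write_test(path, total_mb):
    """Write total_mb MiB of zeros sequentially and return MiB/s."""
    buf = b"\0" * BLOCK
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(total_mb):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())  # force the data out to the device before stopping the clock
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total_mb / elapsed


def read_test(path):
    """Read the file back sequentially and return (MiB read, MiB/s)."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        if hasattr(os, "posix_fadvise"):
            try:
                # Hint only: ask the kernel to drop cached pages for this file.
                os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                pass  # some filesystems do not support the hint
        start = time.perf_counter()
        while True:
            chunk = f.read(BLOCK)
            if not chunk:
                break
            total += len(chunk)
        elapsed = max(time.perf_counter() - start, 1e-9)
    mib = total / BLOCK
    return mib, mib / elapsed


if __name__ == "__main__":
    path = "/mnt/san_lun/ddtest.bin"  # hypothetical mount point on the LUN under test
    print("write MiB/s:", write_test(path, 1024))
    print("read  MiB/s:", read_test(path)[1])
    os.remove(path)
```

Use a file size well above the host's RAM if you want the read number to mean
anything; otherwise a large part of it may still come from memory.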

SAN itself:
- Could the SAN be oversubscribed? How many hosts and LUNs do they have in
total, and what are the queue depths for those hosts? With the QLogic card,
you can set the queue depth in the adapter's BIOS while the system is
booting up. CTRL-Q I think. If the system has enough local DASD to relocate
the database internally, it might be a valid test to do so and see if you
can isolate the problem to the SAN itself.
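
On the Linux side, the queue depth currently in effect for each SCSI disk can
usually be read back from sysfs, which makes it easy to compare hosts. A
rough sketch, assuming the kernel exposes /sys/block/<dev>/device/queue_depth
for the LUNs in question (the function name and the sysfs_root parameter are
mine, purely for illustration):

```python
import os


def queue_depths(sysfs_root="/sys/block"):
    """Return {device name: queue depth} for block devices that expose one."""
    depths = {}
    for dev in sorted(os.listdir(sysfs_root)):
        qd_path = os.path.join(sysfs_root, dev, "device", "queue_depth")
        try:
            with open(qd_path) as f:
                depths[dev] = int(f.read().strip())
        except (OSError, ValueError):
            # No queue_depth attribute (loop, ram, md, ...) or unreadable: skip it.
            continue
    return depths


if __name__ == "__main__":
    for dev, depth in queue_depths().items():
        print(f"{dev}: queue_depth={depth}")
```

This only shows what the host is using; how much queueing the SAN can absorb
across all attached hosts is a separate question for the array's own tools.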

PG itself:

If you think it's a pgsql configuration problem, I'm guessing you already
configured postgresql.conf to match theirs (or at least a fraction of theirs,
since the memory isn't the same?). What about loading a "from-scratch"
config file and restarting the tuning process?

Just a dump of the thought process of someone who's been spending too much
time tuning his SAN and postgres lately.
