Re: SAN performance mystery

From: "John Vincent" <pgsql-performance(at)lusis(dot)org>
To: "Tim Allen" <tim(at)proximity(dot)com(dot)au>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: SAN performance mystery
Date: 2006-06-19 12:54:18
Message-ID: c841561b0606190554of25d05fgc8b6e3f950910236@mail.gmail.com
Lists: pgsql-performance

On 6/19/06, Tim Allen <tim(at)proximity(dot)com(dot)au> wrote:
>
>
> As I noted in another thread, the HBA is an Emulex LP1050, and they have
> a rather old driver for it. I've recommended that they update ASAP. This
> hasn't happened yet.

Yeah, I saw that in a later thread. I would also suggest investigating the
BIOS settings on the HBA itself. For example, the QLogic HBAs have profiles
of sorts, one for tape and one for disk. Could be something there.

> OK, thanks, I'll ask the customer whether they've used PowerPath at all.
> They do seem to have it installed on the machine, but I suppose that
> doesn't guarantee it's being used correctly. However, it looks like they
> have just the one HBA, so, if I've correctly understood what load
> balancing means in this context, it's not going to help; right?

If they have a single HBA then no, it won't help. I'm not very familiar with
PowerPath, but it might even HURT if they have it enabled with one HBA. As an
example, we were in the process of migrating an AIX LPAR to our DS6800. We
only had one spare HBA to assign it. The default policy with the SDD driver
is lb (load balancing). The problem is that with the SDD driver you see
multiple hdisks per HBA per controller port on the SAN. Since we had 4
controller ports active on the SAN, our HBA saw 4 hdisks per LUN. The SDD
driver abstracts that out as a single vpath and you use the vpaths as your
PV on the system. The problem was that it was attempting to load balance
across a single HBA, which was NOT what we wanted.
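
As a rough illustration of the path arithmetic above, here is a minimal Python sketch. The function name is hypothetical, and it assumes every HBA is zoned to every active controller port; it does not call any SDD or PowerPath interface.

```python
def paths_per_lun(hbas, controller_ports):
    """Number of hdisks the host sees for one LUN.

    Assumption: each HBA can reach each active controller port,
    so the host gets one path (one hdisk) per HBA/port pair.
    """
    return hbas * controller_ports


# The case described above: one HBA, four active controller ports.
single_hba = paths_per_lun(hbas=1, controller_ports=4)

# Hypothetical comparison: the same SAN with a second HBA added.
dual_hba = paths_per_lun(hbas=2, controller_ports=4)

print(f"1 HBA: {single_hba} hdisks per LUN behind one vpath")
print(f"2 HBAs: {dual_hba} hdisks per LUN behind one vpath")
```

With one HBA, all four paths share the same physical adapter, so balancing across them spreads I/O over controller ports but not over HBAs.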

>
> I've done some dd'ing myself, as described in another thread. The
> results are not at all encouraging - their SAN seems to do about 20MB/s
> or less.

I saw that as well.

> The SAN possibly is over-subscribed. Can you suggest any easy ways for
> me to find out? The customer has an IT department who look after their
> SANs, and they're not keen on outsiders poking their noses in. It's hard
> for me to get any direct access to the SAN itself.

When I say over-subscribed, you have to look at all the active LUNs and all
of the systems attached as well. With the DS4300 (standard not turbo
option), the SAN can handle 512 I/Os per second. If I have 4 LUNs assigned
to four systems (1 per system), and each LUN has a queue_depth of 128 from
each system, I'll oversubscribe with the next host attach unless I back the
queue_depth off on each host. Contrast that with the Turbo controller option
which does 1024 I/Os per sec and I can duplicate what I have now or add a
second LUN per host. I can't even find how much our DS6800 supports.
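
The oversubscription arithmetic above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the 512 and 1024 limits are the figures quoted above, and treating the controller limit as a single budget shared evenly by all LUNs is a simplifying assumption.

```python
def total_queue_demand(hosts):
    """Sum of luns * queue_depth over all attached hosts.

    hosts is a list of (luns, queue_depth) tuples, one per host.
    """
    return sum(luns * depth for luns, depth in hosts)


def is_oversubscribed(hosts, controller_limit):
    """True if the hosts can queue more I/O than the controller limit."""
    return total_queue_demand(hosts) > controller_limit


def max_queue_depth(controller_limit, total_luns):
    """Largest uniform per-LUN queue_depth that stays within the limit."""
    return controller_limit // total_luns


DS4300_STANDARD = 512   # figure quoted above
DS4300_TURBO = 1024     # figure quoted above

four_hosts = [(1, 128)] * 4   # 4 systems, 1 LUN each, queue_depth 128
five_hosts = [(1, 128)] * 5   # the "next host attach"

print(is_oversubscribed(four_hosts, DS4300_STANDARD))
print(is_oversubscribed(five_hosts, DS4300_STANDARD))
print(max_queue_depth(DS4300_STANDARD, total_luns=5))
```

Four hosts at queue_depth 128 exactly fill the standard controller's 512; a fifth host pushes the total to 640, so each host would need its queue_depth backed off to stay within the limit.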

> Thanks for all the suggestions, John. I'll keep trying to follow some of
> them up.

From what I can tell, the SATA problem other people have mentioned sounds
like the culprit.
