Re: SAN performance mystery

From: Tim Allen <tim(at)proximity(dot)com(dot)au>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: SAN performance mystery
Date: 2006-06-16 09:11:01
Message-ID: 449275A5.5090503@proximity.com.au
Lists: pgsql-performance
Tim Allen wrote:
> We have a customer who are having performance problems. They have a 
> large (36G+) postgres 8.1.3 database installed on an 8-way opteron with 
> 8G RAM, attached to an EMC SAN via fibre-channel (I don't have details 
> of the EMC SAN model, or the type of fibre-channel card at the moment). 
> They're running RedHat ES3 (which means a 2.4.something Linux kernel).

> To simplify greatly - single local SATA disk beats EMC SAN by factor of 
> four.
> 
> Is that expected performance, anyone? It doesn't sound right to me. Does 
> anyone have any clues about what might be going on? Buggy kernel 
> drivers? Buggy kernel, come to think of it? Does a SAN just not provide 
> adequate performance for a large database?
> 
> I'd be grateful for any clues anyone can offer,
> 
> Tim

Thanks to all who have replied so far. I've learned a few new things in 
the meantime.

Firstly, the fibre-channel card is an Emulex LP1050. The customer seems 
to have rather old drivers for it, so I have recommended that they 
upgrade asap. I've also suggested they might like to upgrade their 
kernel to something recent too (e.g. upgrade to RHEL4), but there's no 
telling whether they'll accept that recommendation.

The fact that SATA drives are wont to lie about write completion, which 
several posters have pointed out, presumably has an effect on write 
performance (i.e. apparent write performance is increased at the cost of 
an increased risk of data loss), but, again presumably, not much of an 
effect on read performance. After loading the customer's database on our 
fairly modest box with the single SATA disk, we also tested select query 
performance, and while we didn't see a factor of four gain, we certainly 
saw that read performance is also substantially better. So the fsync 
issue possibly accounts for part of our factor-of-four, but not all of 
it. That is, the SAN is still not doing well by comparison, even 
allowing for the presumption that it is more honest.
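
For anyone wanting to check the write-completion point on their own 
hardware, here is a rough sketch (not something we ran on the customer 
box) that counts how many write+fsync cycles per second a filesystem 
will acknowledge. The function name fsync_rate and its parameters are 
made up for illustration. The reasoning is that a 7200 rpm disk turns 
120 times a second, so a rate far above that when rewriting the same 
block suggests a cache is acknowledging the writes. That cache may be a 
drive being dishonest, or a battery-backed controller or SAN cache 
being legitimately fast; the number alone can't tell you which.

```python
import os
import tempfile
import time


def fsync_rate(directory, writes=200, block_size=8192):
    """Return the number of write+fsync cycles completed per second.

    Hypothetical helper for illustration; 8192 bytes is used only
    because it matches the default postgres page size.
    """
    payload = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.lseek(fd, 0, os.SEEK_SET)  # rewrite the same block each time
            os.write(fd, payload)
            os.fsync(fd)                  # ask the OS to flush to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return writes / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    import sys

    # Pass a directory on the volume under test, e.g. the SAN mount point.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(f"{fsync_rate(target):.0f} fsyncs/sec on {target}")
```

Running it once against the SAN mount and once against the local SATA 
disk gives a like-for-like comparison of the synchronous write path, 
separate from anything postgres itself is doing.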

One curious thing is that some postgres backends seem to spend an 
inordinate amount of time in uninterruptible iowait state. I found a 
posting to this list from December 2004 from someone who reported that 
very same thing. For example, bringing down postgres on the customer box 
requires kill -9, because there are invariably one or two processes so 
deeply uninterruptible as to not respond to a politer signal. That 
indicates something not quite right, doesn't it?
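
To catch those processes in the act, something along these lines could 
be run on the box. It is only a sketch, assuming the usual Linux /proc 
layout, where the field after the parenthesised command name in 
/proc/<pid>/stat is the process state and D means uninterruptible 
sleep. The function names are my own.

```python
import os


def parse_state(stat_line):
    """Return (pid, command, state) from the contents of /proc/<pid>/stat."""
    # The command is wrapped in parentheses and may itself contain spaces
    # or parentheses, so split on the last ')' rather than on whitespace.
    head, _, tail = stat_line.rpartition(")")
    pid, _, comm = head.partition(" (")
    return int(pid), comm, tail.split()[0]


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, command) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_state(handle.read())
        except OSError:
            continue  # the process exited while we were looking
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

A process passing briefly through state D is normal during disk I/O, so 
a single sample proves little. Sampling every second or so and looking 
for the same pid stuck there across many samples is what would point at 
the driver or the storage path.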

Tim

-- 
-----------------------------------------------
Tim Allen          tim(at)proximity(dot)com(dot)au
Proximity Pty Ltd  http://www.proximity.com.au/
