Re: Intel SSDs that may not suck

From: Jeff <threshar(at)torgo(dot)978(dot)org>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Andy <angelflow(at)yahoo(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>
Subject: Re: Intel SSDs that may not suck
Date: 2011-03-29 14:16:51
Message-ID: 8BE8F356-319F-4BE7-BE66-F45D50235985@torgo.978.org
Lists: pgsql-performance


On Mar 29, 2011, at 12:13 AM, Merlin Moncure wrote:

>
> My own experience with MLC drives is that write cycle expectations are
> more or less as advertised. They do go down (hard), and have to be
> monitored. If you are writing a lot of data this can get pretty
> expensive although the cost dynamics are getting better and better for
> flash. I have no idea what would be precisely prudent, but maybe some
> good monitoring tools and phased obsolescence at around 80% duty cycle
> might not be a bad starting point. With hard drives, you can kinda
> wait for em to pop and swap em in -- this is NOT a good idea for flash
> raid volumes.

We've been running some of our DBs on SSDs (X25-Ms; we also have a
pair of X25-Es in another box that we use for some super hot tables).
They have been in production for well over a year (in some cases,
nearly two years) under heavy load.

We're currently being bit in the ass by performance degradation and
we're working out plans to remedy the situation. One box has 8 X25-Ms
in a RAID 10 behind a P400 controller. First, the P400 is not that
powerful, and we've run experiments with newer (P812) controllers
that have been generally positive. The main symptom we've been seeing
is write stalls: writing will go, then come to a complete halt for
0.5-2 seconds, then resume. The fix we're going to do is replace each
drive in turn, letting the array rebuild between swaps. Each drive
that comes out then gets a security erase to reset it back to
completely empty (including the "spare" blocks kept around for
writes).
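
For anyone who wants to see whether they are hitting the same kind of
stall, here is a rough sketch of a probe. This is not what we run in
production; the function names, the 8 KB block size and the 0.5 s
threshold (taken from the low end of the stalls described above) are
all assumptions for illustration. It writes small blocks with an fsync
after each one and reports any write that took longer than the
threshold.

```python
import os
import time


def probe_write_latencies(path, count=100, block_size=8192):
    """Write `count` blocks to `path`, fsync after each one, and
    return the wall-clock seconds each write+fsync took."""
    block = b"\0" * block_size
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(count):
            start = time.monotonic()
            os.write(fd, block)
            os.fsync(fd)  # force the data down to the device
            latencies.append(time.monotonic() - start)
    finally:
        os.close(fd)
    return latencies


def find_stalls(latencies, threshold=0.5):
    """Return (index, seconds) for every write that took at least
    `threshold` seconds (hypothetical cutoff, tune to taste)."""
    return [(i, t) for i, t in enumerate(latencies) if t >= threshold]
```

Point probe_write_latencies() at a scratch file on the volume in
question and feed the result to find_stalls(). Keep in mind that a
controller with a battery-backed write cache can hide or smear the
stalls, so treat the numbers as a hint rather than a measurement.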

Now that all sounds awful and horrible until you get to overall
performance, especially with reads: you are looking at 20k random
reads per second with a few disks. Adding in writes does kick it down
a notch, but you're still looking at 10k+ IOPS. That is the current
trade-off.

In general, I wouldn't recommend the cciss stuff with SSDs at this
time because it makes things such as security erase and SMART
near impossible. (Performance seems OK, though.) We've got some tests
planned to see what we can do with an Areca controller and some SSDs.

Also note that there is a funky interaction between the MSA70 and
SSDs: they do not work together. (I'm not sure if HP's official
branded SSDs have the same issue.)

The write degradation could probably be monitored by watching svctm
(service time) from sar. We may be implementing that in the near
future to detect when this creeps up again.
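
A minimal sketch of what that monitoring might look like, assuming
text in the layout that `sar -d` prints on sysstat versions that still
include a svctm column (newer releases have dropped it). The function
names and the millisecond limit are made up for illustration; we have
not settled on real thresholds. Columns are located by header name
rather than position, since the layout varies between versions.

```python
def svctm_by_device(sar_text):
    """Map device name -> list of svctm samples (ms) found in
    `sar -d` style text. Columns are located via the header line."""
    samples = {}
    dev_col = svctm_col = None
    for line in sar_text.splitlines():
        fields = line.split()
        if "DEV" in fields and "svctm" in fields:
            # Header line (it may repeat); remember the column positions.
            dev_col = fields.index("DEV")
            svctm_col = fields.index("svctm")
            continue
        if dev_col is None or len(fields) <= svctm_col:
            continue  # no header seen yet, or a blank/short line
        if fields[0].startswith("Average"):
            continue  # summary rows do not line up with the header
        try:
            value = float(fields[svctm_col])
        except ValueError:
            continue
        samples.setdefault(fields[dev_col], []).append(value)
    return samples


def degraded_devices(samples, limit_ms):
    """Devices whose worst svctm sample exceeds `limit_ms`
    (a hypothetical alert threshold)."""
    return sorted(dev for dev, vals in samples.items()
                  if vals and max(vals) > limit_ms)
```

Capture the output of `sar -d` however you like, pass the text to
svctm_by_device(), and alert on whatever degraded_devices() returns.
The limit should come from a baseline taken while the drives are
still behaving.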

--
Jeff Trout <jeff(at)jefftrout(dot)com>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/
