Re: Contemplating SSD Hardware RAID

From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: Florian Weimer <fweimer(at)bfk(dot)de>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Contemplating SSD Hardware RAID
Date: 2011-06-21 20:10:35
Message-ID: 4E00FABB.6030302@gmail.com
Lists: pgsql-performance

On 2011-06-21 17:11, Greg Smith wrote:
> On 06/21/2011 07:19 AM, Florian Weimer wrote:
>> 3ware controllers offer SMART pass-through, and smartctl supports it.
>> I'm sure there's something similar for Areca controllers.
>
> Depends on the model, drives, and how you access the management
> interface. For both manufacturers actually. Check out
> http://notemagnet.blogspot.com/2008/08/linux-disk-failures-areca-is-not-so.html
> for example. There I talk about problems with a specific Areca
> controller, as well as noting in a comment at the end that there are
> limitations with 3ware not supporting SMART reports against
> SAS drives.
>
> Part of the whole evaluation chain for new server hardware, especially
> for SSD, needs to be a look at what SMART data you can get. Yeb, I'd
> be curious to get more details about what you've been seeing here if
> you can share it. You have more different models around than I have
> access to, especially the OCZ ones which I can't get my clients to
> consider still. (Their concerns about compatibility and support from
> a relatively small vendor are not completely unfounded)
>

This is what a Windows OCZ tool reports about the different SMART
values (apologies for the lack of markup) for a Vertex 2 Pro.

SMART READ DATA
Revision: 10
Attributes List
 ID  Attribute                               Description: value
  1  SSD Raw Read Error Rate                 Normalized Rate: 120 total ECC and RAISE errors
  5  SSD Retired Block Count                 Reserve blocks remaining: 100%
  9  SSD Power-On Hours                      Total hours power on: 451
 12  SSD Power Cycle Count                   Count of power on/off cycles: 61
 13  SSD Soft Read Error Rate                Normalized Rate: 120
100  SSD GBytes Erased                       Flash memory erases across the entire drive: 128 GB
170  SSD Number of Remaining Spares          Number of reserve Flash memory blocks: 17417
171  SSD Program Fail Count                  Total number of Flash program operation failures: 0
172  SSD Erase Fail Count                    Total number of Flash erase operation failures: 0
174  SSD Unexpected power loss count         Total number of unexpected power loss: 13
177  SSD Wear Range Delta                    Delta between most-worn and least-worn Flash blocks: 0
181  SSD Program Fail Count                  Total number of Flash program operation failures: 0
182  SSD Erase Fail Count                    Total number of Flash erase operation failures: 0
184  SSD End to End Error Detection          I/O errors detected during reads from flash memory: 0
187  SSD Reported Uncorrectable Errors       Uncorrectable RAISE errors reported to the host for all data access: 0
194  SSD Temperature Monitoring              Current: 26 High: 37 Low: 0
195  SSD ECC On-the-fly Count                Normalized Rate: 120
196  SSD Reallocation Event Count            Total number of reallocated Flash blocks: 0
198  SSD Uncorrectable Sector Count          Total number of uncorrectable errors when reading/writing a sector: 0
199  SSD SATA R-Errors Error Count           Current SATA RError count: 0
201  SSD Uncorrectable Soft Read Error Rate  Normalized Rate: 120
204  SSD Soft ECC Correction Rate (RAISE)    Normalized Rate: 120
230  SSD Life Curve Status                   Current state of drive operation based upon the Life Curve: 100
231  SSD Life Left                           Approximate SDD life Remaining: 99%
232  SSD Available Reserved Space            Amount of Flash memory space in reserve (GB): 17
235  SSD Supercap Health                     Condition of an external SuperCapacitor Health in mSec: 0
241  SSD Lifetime writes from host           Number of bytes written to SSD: 448 GB
242  SSD Lifetime reads from host            Number of bytes read from SSD: 192 GB

The same tool's output for a Vertex 3 (not Pro):

SMART READ DATA
Revision: 10
Attributes List
 ID  Attribute                               Description: value
  1  SSD Raw Read Error Rate                 Normalized Rate: 120 total ECC and RAISE errors
  5  SSD Retired Block Count                 Reserve blocks remaining: 100%
  9  SSD Power-On Hours                      Total hours power on: 7
 12  SSD Power Cycle Count                   Count of power on/off cycles: 13
171  SSD Program Fail Count                  Total number of Flash program operation failures: 0
172  SSD Erase Fail Count                    Total number of Flash erase operation failures: 0
174  SSD Unexpected power loss count         Total number of unexpected power loss: 10
177  SSD Wear Range Delta                    Delta between most-worn and least-worn Flash blocks: 0
181  SSD Program Fail Count                  Total number of Flash program operation failures: 0
182  SSD Erase Fail Count                    Total number of Flash erase operation failures: 0
187  SSD Reported Uncorrectable Errors       Uncorrectable RAISE errors reported to the host for all data access: 0
194  SSD Temperature Monitoring              Current: 128 High: 129 Low: 127
195  SSD ECC On-the-fly Count                Normalized Rate: 100
196  SSD Reallocation Event Count            Total number of reallocated Flash blocks: 0
201  SSD Uncorrectable Soft Read Error Rate  Normalized Rate: 100
204  SSD Soft ECC Correction Rate (RAISE)    Normalized Rate: 100
230  SSD Life Curve Status                   Current state of drive operation based upon the Life Curve: 100
231  SSD Life Left                           Approximate SDD life Remaining: 100%
241  SSD Lifetime writes from host           Number of bytes written to SSD: 162 GB
242  SSD Lifetime reads from host            Number of bytes read from SSD: 236 GB

There's some info buried in
http://archives.postgresql.org/pgsql-performance/2011-03/msg00350.php
where two Vertex 2 Pros are compared; the first has been really
hammered with pgbench, the second had a few months' duty in a
workstation. The raw value of SSD Available Reserved Space seems to be a
good candidate to watch as it drops toward 0, since the pgbenched drive
has 16GB left and the workstation disk 17GB. It would be cool to graph
it with e.g. symon (http://i.imgur.com/T4NAq.png).
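
A rough sketch of how that raw value could be polled for graphing. It
assumes smartmontools is installed and that the drive (or the controller
pass-through) exposes attribute 232 in the usual `smartctl -A` table
layout, which I have not verified for every model discussed here. The
function names and the SAMPLE text are made up for illustration; only the
raw values 451 and 17 come from the dump above.

```python
import subprocess

# Hand-written sample that mimics the `smartctl -A` column layout.
# FLAG/VALUE/WORST/THRESH entries are placeholders, not real readings.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0000   000   000   000    Old_age   Offline      -       451
232 Available_Reservd_Space 0x0000   000   000   000    Old_age   Offline      -       17
"""


def parse_smart_attributes(text):
    """Return {attribute_id: (name, raw_value)} for rows with an integer raw value."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows carry at least 10 columns: ID, name, flag, value,
        # worst, thresh, type, updated, when_failed, raw.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        if not fields[2].startswith("0x"):
            continue
        raw = fields[9]
        if raw.isdigit():
            attrs[int(fields[0])] = (fields[1], int(raw))
    return attrs


def read_smart(device, dev_type=None):
    """Run `smartctl -A` and return its text output.

    dev_type is passed to -d, e.g. "3ware,0" for controller pass-through.
    smartctl uses a bitmask exit status, so a non-zero code is not treated
    as a failure here.
    """
    cmd = ["smartctl", "-A"]
    if dev_type:
        cmd += ["-d", dev_type]
    cmd.append(device)
    return subprocess.run(cmd, capture_output=True, text=True).stdout


def reserve_remaining(attrs, attr_id=232):
    """Return the raw reserved-space value, or None if the drive lacks it."""
    entry = attrs.get(attr_id)
    return entry[1] if entry else None
```

Feeding `reserve_remaining(parse_smart_attributes(read_smart("/dev/sda")))`
into whatever does the graphing would be one way to track it over time;
the device path is of course just an example.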

--
Yeb Havinga
http://www.mgrid.net/
Mastering Medical Data
