Re: New server: SSD/RAID recommendations?

From: "Graeme B(dot) Bell" <graeme(dot)bell(at)nibio(dot)no>
To: "Wes Vaske (wvaske)" <wvaske(at)micron(dot)com>
Cc: "Graeme B(dot) Bell" <graeme(dot)bell(at)nibio(dot)no>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: New server: SSD/RAID recommendations?
Date: 2015-07-07 15:53:43
Message-ID: 66C9C5BA-FBF8-406B-8666-D6A75D704992@skogoglandskap.no
Lists: pgsql-performance


Hi Wes

1. The first interesting thing is that prior to my mentioning this problem to C_____ a year or two back, the power loss protection was advertised everywhere as simply that, without qualifiers about 'not inflight data'. Check out the marketing of the M500 for the first year or so and try to find an example where they say 'but inflight data isn't protected!'.

2. The second (and more important) interesting thing is that this is irrelevant!

Fsync'd data is BY DEFINITION not data in flight.
Fsync means "This data is secure on the disk!"
However, the drives corrupt it.

Postgres's sanity depends on a reliable fsync. That's why we see posts on the performance list saying 'fsync=no makes your postgres faster but really, don't do it in production'.
We are talking about internal DB corruption, not just a crash and a few lost transactions.

These drives return from fsync while data is still in volatile cache.
That's breaking the spec, and it's why they are not OK for postgres by themselves.

This is not about 'in-flight' data; it's about fsync'd WAL data.
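
To make the fsync contract above concrete, here is a minimal Python sketch of the rule a database relies on: a record is only treated as acknowledged after fsync has returned. It is an illustration, not the test procedure used in this thread. The function names `durable_append` and `missing_records`, and the one-record-per-line file layout, are hypothetical. A real plug-pull test would also need to keep the list of acknowledged records on a different machine, since the machine under test loses power.

```python
import os


def durable_append(path: str, record: bytes) -> None:
    """Append one record and return only after fsync has returned.

    Assumption: the caller treats a normal return as "this record is safe".
    A drive that reports completion while the data still sits in volatile
    cache breaks exactly this assumption.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        offset = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while offset < len(record):
            offset += os.write(fd, record[offset:])
        # Ask the OS to flush the file's data to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)


def missing_records(path: str, acknowledged: list) -> list:
    """Return the acknowledged records that are absent from the file.

    Run this after power is restored. A non-empty result means data was
    lost even though fsync had returned for it.
    """
    try:
        with open(path, "rb") as f:
            present = set(f.read().splitlines())
    except FileNotFoundError:
        present = set()
    return [r for r in acknowledged if r.rstrip(b"\n") not in present]
```

Usage, under the same assumptions: call `durable_append(path, b"rec-1\n")` for each record, note the record as acknowledged only after the call returns, pull the plug, and after reboot check that `missing_records(path, acknowledged)` is empty. On a newly created file, the directory entry may also need its own fsync on the parent directory, which this sketch omits.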

Graeme.

On 07 Jul 2015, at 16:15, Wes Vaske (wvaske) <wvaske(at)micron(dot)com> wrote:

> The M500/M550/M600 are consumer class drives that don't have power protection for all inflight data.* (like the Samsung 8x0 series and the Intel 3x0 & 5x0 series).
>
> The M500DC has full power protection for inflight data and is an enterprise-class drive (like the Samsung 845DC or Intel S3500 & S3700 series).
>
> So any drive without the capacitors to protect inflight data will suffer from data loss if you're using disk write cache and you pull the power.
>
> *Big addendum:
> There are two issues on power loss that will mess with Postgres: Data Loss and Data Corruption. The Micron consumer drives have power loss protection against Data Corruption, and the enterprise drive has power loss protection against BOTH.
>
> https://www.micron.com/~/media/documents/products/white-paper/wp_ssd_power_loss_protection.pdf
>
> The Data Corruption problem is only an issue in non-SLC NAND but it's industry wide. And even though some drives will protect against that, the protection of inflight data that's been fsync'd is more important and should disqualify *any* consumer drives from *any* company from consideration for use with Postgres.
>
> Wes Vaske | Senior Storage Solutions Engineer
> Micron Technology
>
> -----Original Message-----
> From: Graeme B. Bell [mailto:graeme(dot)bell(at)nibio(dot)no]
> Sent: Tuesday, July 07, 2015 8:26 AM
> To: Merlin Moncure
> Cc: Wes Vaske (wvaske); Craig James; pgsql-performance(at)postgresql(dot)org
> Subject: Re: [PERFORM] New server: SSD/RAID recommendations?
>
>
> As I have warned elsewhere,
>
> The M500/M550 from $SOME_COMPANY is NOT SUITABLE for postgres unless you have a RAID controller with BBU to protect yourself.
> The M500/M550 are NOT plug-pull safe despite the 'power loss protection' claimed on the packaging. Not all fsync'd data is preserved in the event of a power loss, which completely undermines postgres's sanity.
>
> I would be extremely skeptical about the M500DC given the name and manufacturer.
>
> I went to quite a lot of trouble to provide $SOME_COMPANY's engineers with the full details of this fault after extensive testing (we have e.g. 20-25 of these disks) on multiple machines and controllers, at their request. Result: they stopped replying to me, and soon after I saw their PR reps talking about how 'power loss protection isn't about protecting all data during a power loss'.
>
> The only safe way to use an M500/M550 with postgres is:
>
> a) disable the disk cache, which will cripple performance to about 3-5% of normal.
> b) use a battery backed or cap-backed RAID controller, which will generally hurt performance, by limiting you to the peak performance of the flash on the raid controller.
>
> If you are buying such a drive, I strongly recommend buying only one and doing extensive plug pull testing before committing to several.
> For myself, my time is valuable enough that it will be cheaper to buy intel in future.
>
> Graeme.
>
> On 07 Jul 2015, at 15:12, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>
>> On Thu, Jul 2, 2015 at 1:00 PM, Wes Vaske (wvaske) <wvaske(at)micron(dot)com> wrote:
>> Storage Review has a pretty good process and reviewed the M500DC when it released last year. http://www.storagereview.com/micron_m500dc_enterprise_ssd_review
>>
>>
>>
>> The only database-specific info we have available are for Cassandra and MSSQL:
>>
>> http://www.micron.com/~/media/documents/products/technical-marketing-brief/cassandra_and_m500dc_enterprise_ssd_tech_brief.pdf
>>
>> http://www.micron.com/~/media/documents/products/technical-marketing-brief/sql_server_2014_and_m500dc_raid_configuration_tech_brief.pdf
>>
>>
>>
>> (some of that info might be relevant)
>>
>>
>>
>> In terms of endurance, the M500DC is rated to 2 Drive Writes Per Day (DWPD) for 5-years. For comparison:
>>
>> Micron M500DC (20nm) - 2 DWPD
>>
>> Intel S3500 (20nm) - 0.3 DWPD
>>
>> Intel S3510 (16nm) - 0.3 DWPD
>>
>> Intel S3710 (20nm) - 10 DWPD
>>
>>
>>
>> They're all great drives, the question is how write-intensive is the workload.
>>
>>
>>
>>
>> Intel added a new product, the 3610, that is rated for 3 DWPD. Pricing looks to be around $1.20/GB.
>>
>> merlin
>
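
The DWPD ratings quoted above can be turned into total data written once a drive capacity is fixed. The sketch below shows the arithmetic only. The 480 GB capacity is an assumption chosen for illustration and is not taken from the thread, and applying the 5-year period to the Intel drives is also an assumption, since the thread states it only for the M500DC.

```python
def total_terabytes_written(dwpd: float, capacity_gb: float, years: float = 5.0) -> float:
    """Convert a Drive Writes Per Day rating into total TB written.

    total = DWPD * capacity * 365 days * years, with 1 TB taken as 1000 GB.
    """
    return dwpd * capacity_gb * 365 * years / 1000


# Assumption: a hypothetical 480 GB drive for every model, for comparison only.
ASSUMED_CAPACITY_GB = 480

# DWPD figures as quoted in the thread.
ratings = {
    "Micron M500DC": 2.0,
    "Intel S3500": 0.3,
    "Intel S3510": 0.3,
    "Intel S3610": 3.0,
    "Intel S3710": 10.0,
}

for name, dwpd in ratings.items():
    tb = total_terabytes_written(dwpd, ASSUMED_CAPACITY_GB)
    print(f"{name}: about {tb:,.0f} TB written over 5 years")
```

Comparing this total against the measured daily write volume of the database (WAL plus data files) gives a rough answer to the question of how write-intensive the workload is relative to the drive.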
