Re: Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Michael March <mmarch(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Completely un-tuned Postgresql benchmark results: SSD vs desktop HDD
Date: 2010-08-11 23:05:35
Message-ID: 72E29316-5F4C-4A01-8924-5340DF2FAE65@richrelevance.com
Lists: pgsql-performance


On Aug 10, 2010, at 9:21 AM, Greg Smith wrote:

> Scott Carey wrote:
>> Also, the amount of data at risk in a power loss varies between
>> drives. For Intel's drives, its a small chunk of data ( < 256K). For
>> some other drives, the cache can be over 30MB of outstanding writes.
>> For some workloads this is acceptable
>
> No, it isn't ever acceptable. You can expect the type of data loss you
> get when a cache fails to honor write flush calls results in
> catastrophic database corruption. It's not "I lost the last few
> seconds";

I never said it was.

> it's "the database is corrupted and won't start" after a
> crash.

Which is sometimes acceptable. There is NO GUARANTEE that you won't lose data, ever. An increase in the likelihood is an acceptable tradeoff in some situations, especially when that increase is small. On ANY power loss event, with or without battery-backed caches and such, you should proactively run a consistency check on the system. With less reliable hardware that task becomes much more of a burden, and is much more likely to end in restoring data from somewhere.
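
As a concrete (and admittedly crude) example of that proactive check, something along these lines just forces a full read of every user table on the freshly restarted cluster, so damaged heap pages show up immediately. psycopg2, the connection string, and the schema filter are my assumptions here, and indexes/TOAST would need their own pass (e.g. REINDEX):

import psycopg2

DSN = "dbname=mydb user=postgres"    # hypothetical connection string

conn = psycopg2.connect(DSN)
conn.autocommit = True               # each statement runs in its own transaction
cur = conn.cursor()

cur.execute("""
    SELECT schemaname, tablename
    FROM pg_tables
    WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
""")

for schema, table in cur.fetchall():
    check = conn.cursor()
    try:
        # A full count reads through the heap; naive identifier quoting is
        # fine for a sketch, not for hostile schema names.
        check.execute('SELECT count(*) FROM "%s"."%s"' % (schema, table))
        check.fetchone()
    except psycopg2.Error as e:
        print("possible corruption in %s.%s: %s" % (schema, table, e))
    finally:
        check.close()

cur.close()
conn.close()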

What is the likelihood that your RAID card fails, or that the battery that reported 'good health' only lasts 5 minutes and you lose data before power is restored? What is the likelihood of human error?
Not that far off from the likelihood of power failure in a datacenter with redundant power. One MUST have a DR plan. Never assume that your perfect hardware won't fail.

> This is why we pound on this topic on this list. A SSD that
> fails to honor flush requests is completely worthless for anything other
> than toy databases.

Overblown. Not every DB and use case is a financial application or business-critical app, and many of the rest are not toys at all. Consider slaves, read-only DBs, or simply a subset of the tablespaces:

Indexes. (per application, schema)
Tables. (per application, schema)
System tables / indexes.
WAL.

Each has different reliability requirements and different consequences when recently written data is lost. Losing less than 8K can be fatal to the WAL or to table data, while corrupting some tablespaces is not a big deal and corrupting others is catastrophic. The problem with the assertion that this hardware is worthless is that it implies every user and every use case is at the far end of the reliability spectrum.
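
For example (a sketch only, with hypothetical paths, names, and DSN), the rebuildable objects can live on the fast-but-less-safe SSD while the heap, system catalogs, and WAL stay on the battery-backed array:

import psycopg2

conn = psycopg2.connect("dbname=mydb user=postgres")   # hypothetical DSN
conn.autocommit = True   # CREATE TABLESPACE cannot run inside a transaction block
cur = conn.cursor()

# The directory must already exist and be owned by the postgres OS user.
cur.execute("CREATE TABLESPACE ssd_unsafe LOCATION '/mnt/ssd/pg_idx'")

# Put a rebuildable object there: if the SSD eats it on power loss,
# REINDEX recreates it from the (safely stored) heap.
cur.execute("CREATE INDEX orders_customer_idx ON orders (customer_id) "
            "TABLESPACE ssd_unsafe")

cur.close()
conn.close()

Lose that tablespace and you REINDEX; lose the WAL or the catalogs and you are in the catastrophic bucket.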

Yes, that can be a critical requirement for many, perhaps most, DBs. But there are many uses for slightly unsafe storage systems.
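
Whether a given drive is merely "slightly unsafe" is something you can estimate from user space: time a burst of fsync'd writes to it. A 7200 RPM disk that really flushes tops out around 120 fsyncs/sec when rewriting the same block; a drive reporting thousands per second with no capacitor or battery behind its cache is almost certainly acknowledging writes it has not made durable. A rough probe (Python, hypothetical mount point):

import os, time

PATH = "/mnt/ssd/fsync_probe.dat"    # scratch file on the device under test
N = 1000

fd = os.open(PATH, os.O_WRONLY | os.O_CREAT, 0o600)
start = time.time()
for i in range(N):
    os.lseek(fd, 0, os.SEEK_SET)
    os.write(fd, b"x" * 512)         # rewrite the same 512-byte block
    os.fsync(fd)                     # ask the OS and drive to make it durable
elapsed = time.time() - start
os.close(fd)
os.remove(PATH)

print("%.0f fsyncs/sec" % (N / elapsed))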

> You can expect significant work to recover any
> portion of your data after the first unexpected power loss under heavy
> write load in this environment, during which you're down. We do
> database corruption recovery at 2ndQuadrant; while I can't talk about
> the details of some recent incidents, I am not speaking theoretically
> when I warn about this.

I've done the single-user-mode, recover-the-system-tables-by-hand thing myself at 4AM, on a system with battery-backed RAID 10, redundant power, etc. RAID cards die, and restoring 10TB from backup takes a long time.

It's a game of balancing your data loss tolerance against the likelihood of power failure. Both vary widely, and not just for 'toy' DBs. If you know what you are doing, you can use 'fast but not completely safe' storage for many things safely. The chance of loss is NEVER zero; do not assume that 'good' hardware is flawless.

Imagine a common internet case where synchronous_commit=false is fine. Recovery from backups is a pain (but a daily snapshot is taken of the important tables, and a weekly one of the easily recoverable rest). If you expect one power-related failure every 2 years, it might be perfectly reasonable to use 'unsafe' SSDs to support a high transaction load, accepting the risk that the once-every-2-years downtime is 12 hours instead of 30 minutes and includes losing up to a day's information. Applications like this exist all over the place.
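
And that tradeoff does not have to be cluster-wide: synchronous_commit can be turned off per session, or per transaction with SET LOCAL, so only the loss-tolerant traffic skips the WAL flush. A minimal sketch (table and connection string are hypothetical):

import psycopg2

conn = psycopg2.connect("dbname=webapp user=app")    # hypothetical DSN
cur = conn.cursor()

# Commits in this session return before the WAL hits disk. A crash can lose
# the last few hundred milliseconds of transactions, but it does not corrupt
# the cluster the way a write cache that lies about flushes can.
cur.execute("SET synchronous_commit TO off")

cur.execute("INSERT INTO page_views (url, viewed_at) VALUES (%s, now())",
            ("/index.html",))
conn.commit()

cur.close()
conn.close()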

> --
> Greg Smith 2ndQuadrant US Baltimore, MD
> PostgreSQL Training, Services and Support
> greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us
>
