Re: SSD + RAID

From: Dave Crooke <dcrooke(at)gmail(dot)com>
To: david(at)lang(dot)hm
Cc: Aidan Van Dyk <aidan(at)highrise(dot)ca>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: SSD + RAID
Date: 2010-02-24 08:32:40
Message-ID: ca24673e1002240032w3a44c753m49e8c6986cd6cd45@mail.gmail.com
Lists: pgsql-performance

It's always possible to rebuild into a consistent configuration by assigning
a precedence order: for parity RAID, the data drives take precedence over the
parity drives, and for RAID-1 sets the rebuild picks an arbitrary master.
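
As a rough illustration of the precedence idea, here is a minimal sketch in
Python. The function names are hypothetical and this is not how /dev/md or any
particular controller is implemented; it only shows that a consistent state can
always be reached by trusting the data drives (parity RAID) or one chosen copy
(mirror).

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def resync_parity_stripe(data_blocks):
    """Parity RAID: data drives take precedence, so keep them as they are
    and recompute the parity block from their current contents."""
    return xor_blocks(data_blocks)


def resync_mirror(copies, master=0):
    """RAID-1: pick one copy as master (an arbitrary choice) and make
    every other copy match it."""
    return [copies[master]] * len(copies)
```

Either way the array ends up self-consistent, although nothing here says
whether the surviving contents are the old or the new version of a block.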

You *should* never lose a whole stripe ... for example, RAID-5 updates do
"read old data / parity, write new data, write new parity" ... there is no
need to touch any other data disks, so they will be preserved through the
rebuild. Similarly, if only one block is being updated there is no need to
update the entire stripe.
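
The read-modify-write sequence described above can be sketched as follows.
This assumes single-parity XOR as in RAID-5; `small_write` and its arguments
are illustrative names, and blocks are modelled as in-memory byte strings
rather than real disk I/O.

```python
def xor(a, b):
    """XOR two equal-length byte blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(stripe, parity, index, new_data):
    """Update one data block of a stripe without reading the others.

    stripe   -- list of data blocks (one per data drive)
    parity   -- current parity block for the stripe
    index    -- which data block is being rewritten
    new_data -- replacement contents for that block
    """
    old_data = stripe[index]                   # read old data (parity is passed in)
    new_parity = xor(xor(parity, old_data), new_data)
    updated = list(stripe)
    updated[index] = new_data                  # write new data
    return updated, new_parity                 # write new parity


# Only block 1 and the parity block change; blocks 0 and 2 are never touched.
stripe = [b"\x01", b"\x02", b"\x04"]
parity = b"\x07"                               # 0x01 ^ 0x02 ^ 0x04
updated, new_parity = small_write(stripe, parity, 1, b"\x0f")
```

If the new data reaches its drive but the parity write does not, the stale
parity no longer matches, yet the untouched data blocks are still intact and
parity can be recomputed from the data drives.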

David - what caused /dev/md to decide to take an array offline?

Cheers
Dave

On Tue, Feb 23, 2010 at 3:22 PM, <david(at)lang(dot)hm> wrote:

> On Tue, 23 Feb 2010, Aidan Van Dyk wrote:
>
>> * david(at)lang(dot)hm <david(at)lang(dot)hm> [100223 15:05]:
>>>
>>> However, one thing that you do not get protection against with software
>>> raid is the potential for the writes to hit some drives but not others.
>>> If this happens the software raid cannot know what the correct contents
>>> of the raid stripe are, and so you could lose everything in that stripe
>>> (including contents of other files that are not being modified that
>>> happened to be in the wrong place on the array)
>>>
>>
>> That's for stripe-based raid. Mirror sets like raid-1 should give you
>> either the old data, or the new data, both acceptable responses since
>> the fsync/barrier hasn't "completed".
>>
>> Or have I missed another subtle interaction?
>>
>
> one problem is that when the system comes back up and attempts to check the
> raid array, it is not going to know which drive has valid data. I don't know
> exactly what it does in that situation, but this type of error in other
> conditions causes the system to take the array offline.
>
>
> David Lang
