Re: setting up raid10 with more than 4 drives

From: PFC <lists(at)peufeu(dot)com>
To: "Luke Lonergan" <LLonergan(at)greenplum(dot)com>, "Michael Stone" <mstone+postgres(at)mathom(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: setting up raid10 with more than 4 drives
Date: 2007-05-30 15:31:58
Message-ID: op.ts5b3kikcigqcu@apollo13
Lists: pgsql-performance

On Wed, 30 May 2007 16:36:48 +0200, Luke Lonergan
<LLonergan(at)greenplum(dot)com> wrote:

>> I don't see how that's better at all; in fact, it reduces to
>> exactly the same problem: given two pieces of data which
>> disagree, which is right?
>
> The one that matches the checksum.

- postgres tells OS "write this block"
- OS sends block to drives A and B
- drive A happens to be lucky and seeks faster, writes data
- student intern carrying pizzas for senior IT staff trips over power
cord*
- boom
- drive B still has old block

Both blocks have a correct checksum, so only a version counter or timestamp
could tell which one is current.
Fortunately, if fsync() is honored correctly (did you check?), postgres
will zap such errors in recovery.
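
To make the point concrete, here is a minimal sketch in Python. The Block
layout, the per-block "version" counter and the helper names are
hypothetical, made up for illustration; this is not how postgres or any RAID
implementation stores blocks. It only shows that a checksum validates both
copies, while a counter can order them.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Block:
    payload: bytes
    version: int   # hypothetical per-block write counter
    checksum: int  # CRC32 of the payload


def make_block(payload: bytes, version: int) -> Block:
    """Build a block whose checksum matches its payload."""
    return Block(payload, version, zlib.crc32(payload))


def checksum_ok(block: Block) -> bool:
    """True if the stored checksum matches the payload."""
    return zlib.crc32(block.payload) == block.checksum


def pick_current(a: Block, b: Block) -> Block:
    """Among the copies with a valid checksum, return the newest one."""
    valid = [blk for blk in (a, b) if checksum_ok(blk)]
    if not valid:
        raise ValueError("both copies are corrupt")
    return max(valid, key=lambda blk: blk.version)


# Drive A got the new write before the power cord incident, drive B did not.
drive_a = make_block(b"new row data", version=8)
drive_b = make_block(b"old row data", version=7)

# Both copies pass the checksum test, so the checksum alone cannot decide.
both_valid = checksum_ok(drive_a) and checksum_ok(drive_b)
current = pick_current(drive_a, drive_b)
```

Here both_valid is true for the stale copy as well, and only the counter
identifies drive A's block as the current one.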

Smart RAID1 or 0+1 controllers (including software RAID) will distribute
random reads across both disks (but not writes, obviously, since every
write has to go to both).
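
As a toy illustration of that asymmetry, the Python sketch below models a
two-disk mirror. The Mirror class and the round-robin policy are assumptions
for the example only; real controllers and Linux md choose the disk for a
read with their own heuristics.

```python
from itertools import cycle


class Mirror:
    """Toy two-way mirror: writes go to every disk, reads alternate."""

    def __init__(self, n_disks: int = 2):
        # Each "disk" is just a dict mapping block number -> data.
        self.disks = [dict() for _ in range(n_disks)]
        self.reads_per_disk = [0] * n_disks
        self._next_disk = cycle(range(n_disks))

    def write(self, block_no: int, data: bytes) -> None:
        # A write must reach all replicas, so it cannot be spread out.
        for disk in self.disks:
            disk[block_no] = data

    def read(self, block_no: int) -> bytes:
        # Any replica can serve a read; round-robin is assumed here.
        i = next(self._next_disk)
        self.reads_per_disk[i] += 1
        return self.disks[i][block_no]


mirror = Mirror()
mirror.write(1, b"some block")
results = [mirror.read(1) for _ in range(4)]
```

After the four reads, each disk has served two of them, while the single
write cost one write on each disk.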

* = this happened at my old job. Yes, they had a very frightening server
room, or more precisely a "cave"; I never went there, I didn't want to be
the one fired for tripping over the wire...

From the Linux Software RAID HOWTO:

- benchmarking (quite brief!)
http://unthought.net/Software-RAID.HOWTO/Software-RAID.HOWTO-9.html#ss9.5

- read "Data Scrubbing" here:
http://gentoo-wiki.com/HOWTO_Install_on_Software_RAID

- yeah, but does it work? (scary)
http://bugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=405919

md/sync_action
This can be used to monitor and control the resync/recovery
process of MD. In particular, writing "check" here will cause
the array to read all data blocks and check that they are
consistent (e.g. parity is correct, or all mirror replicas are
the same). Any discrepancies found are NOT corrected.

A count of problems found will be stored in md/mismatch_count.

Alternately, "repair" can be written which will cause the same
check to be performed, but any errors will be corrected.
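
A minimal Python sketch of driving these files, assuming an array named md0
exposed under /sys/block/md0/md/ and a process running as root. The function
names and the sysfs_root parameter are made up for this example; the
parameter only exists so the code can be pointed at a scratch directory.

```python
from pathlib import Path


def md_dir(device: str = "md0", sysfs_root: str = "/sys/block") -> Path:
    """Directory holding the md control files for one array."""
    return Path(sysfs_root) / device / "md"


def start_check(device: str = "md0", sysfs_root: str = "/sys/block") -> None:
    """Ask md to read every block and compare replicas, without repairing."""
    (md_dir(device, sysfs_root) / "sync_action").write_text("check\n")


def current_action(device: str = "md0", sysfs_root: str = "/sys/block") -> str:
    """Return the content of sync_action, stripped of whitespace."""
    return (md_dir(device, sysfs_root) / "sync_action").read_text().strip()


def mismatch_count(device: str = "md0", sysfs_root: str = "/sys/block") -> int:
    """Number of problems recorded in md/mismatch_count."""
    text = (md_dir(device, sysfs_root) / "mismatch_count").read_text()
    return int(text.strip())
```

Typical use would be to call start_check(), poll current_action() until the
check is no longer running, and then look at mismatch_count(). Writing
"repair" instead of "check" would ask md to correct what it finds, as the
text above describes.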
