Re: Scaling with memory & disk planning

From: Jean-Luc Lachance <jllachan(at)nsd(dot)ca>
To: terry(at)greatgulfhomes(dot)com
Cc: kgunders(at)cbnlottery(dot)com, pgsql-general(at)postgresql(dot)org
Subject: Re: Scaling with memory & disk planning
Date: 2002-05-30 19:16:30
Message-ID: 3CF67A8E.C9A93ABF@nsd.ca
Lists: pgsql-general

I think your understanding of RAID 5 is also wrong.

For a general N-disk RAID 5 the process is:
1) Read the old data sector
2) XOR it with the data to write
3) Read the old parity sector
4) XOR it with the result above
5) Write the new data
6) Write the new parity

So you can see, for every logical write, there are two reads and two
writes.
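
The six steps above can be sketched in a few lines of Python. This is only an illustration with blocks modeled as byte strings; the function name, the 4-disk stripe, and the sample values are assumptions made for the example, not taken from any real RAID implementation.

```python
def raid5_small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Return the new parity block for a read-modify-write update."""
    # Steps 1-2: XOR the old data (read from disk) with the data to write.
    delta = bytes(a ^ b for a, b in zip(old_data, new_data))
    # Steps 3-4: XOR the old parity (read from disk) with that result.
    return bytes(p ^ d for p, d in zip(old_parity, delta))


# Hypothetical stripe: three data blocks plus one parity block.
d0, d1, d2 = b"\x0f\xf0", b"\x33\xcc", b"\x55\xaa"
parity = bytes(a ^ b ^ c for a, b, c in zip(d0, d1, d2))

# Overwrite d1. Only d1 and the parity are read (two reads), then
# new_d1 and new_parity are written back (two writes).
new_d1 = b"\x12\x34"
new_parity = raid5_small_write(d1, parity, new_d1)

# Cross-check against parity recomputed from the whole stripe.
full_parity = bytes(a ^ b ^ c for a, b, c in zip(d0, new_d1, d2))
```

The point of the XOR trick is that d0 and d2 are never touched, so the cost stays at two reads and two writes no matter how many disks are in the array.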

For a 3-disk RAID 5 the process can be shortened:
1) Write the new data
2) Read the other data disk
3) XOR it with the new data
4) Write the result to the parity disk

So, two writes and one read.
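
A small Python sketch of the 3-disk shortcut, again with blocks as byte strings. The helper names and sample values are assumptions for illustration only. It also checks that the shortcut yields the same parity as the general two-read procedure.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


def three_disk_parity(other_data: bytes, new_data: bytes) -> bytes:
    """With one other data disk, parity is simply other XOR new."""
    return xor_blocks(other_data, new_data)


# Hypothetical 3-disk stripe: two data blocks plus one parity block.
other = b"\x0f\xf0"      # block on the data disk that is not being written
old = b"\x33\xcc"        # block being overwritten (never read by the shortcut)
new = b"\x12\x34"        # data to write
old_parity = xor_blocks(other, old)

# Shortcut: one read (other), two writes (new data, new parity).
shortcut = three_disk_parity(other, new)

# General procedure: two reads (old data, old parity), two writes.
general = xor_blocks(old_parity, xor_blocks(old, new))
```

Both paths give the same parity; the shortcut just skips reading the old data and old parity because the single remaining data block is enough to rebuild it.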

JLL

terry(at)greatgulfhomes(dot)com wrote:
>
> Your RAID analysis is a bit wrong.
> In striping (disk joining) every byte written requires 1 byte sent to 1
> disk. This gives you ZERO redundancy: RAID0 is used purely for making a
> large partition from smaller disks.
>
> In RAID1 (mirroring) every 1 byte written requires 1 byte written to EACH of
> the 2 mirrored disks, for total disk IO of 2 bytes.
>
> In RAID5, the most efficient solution, every 1 byte written requires LESS
> than 1 byte written for the CRC. Roughly (depending on implementation,
> number of disks) every 3 bytes written requires 4 bytes of disk IO.
>
> RAID5 is the fastest from an algorithm standpoint. There are some gotchas:
> RAID5 implemented by hardware is faster than RAID5 implemented by OS, simply
> because the controller on the SCSI card acts like a parallel processor.
>
> RAID5 also wastes the least amount of disk space.
>
> What is the cheapest is a relative thing, what is certain is that RAID 5
> requires more disks (at least 3) than mirroring (exactly 2), but RAID5
> wastes less space, so the cost analysis begins with a big "it depends...".
>
> Any disk system will choke under heavy load, especially if the disk write
> system is inefficient (like IBM's IDE interface). I think if you did a
> test, you would find RAID1 would choke more than RAID5 simply because RAID1
> requires MORE disk IO for the same bytes being saved.
>
> Referring to what Tom Lane said, he recommends 7 drive RAID5 for a very good
> reason: The more the drives, the faster the performance. Here's why:
> Write 7 bytes on a 7 drive RAID5, the first byte goes to drive 1, 2nd byte
> to drive 2, etc, and the CRC to the final drive. For high performance SCSI
> systems, whose BUS IO is faster than drives (and most SCSI IO chains ARE
> faster than the drives they are attached to) the drives actually write in
> PARALLEL. I can give you a more detailed example, but suffice to say that
> with RAID5 writing 7 bytes to 7 data drives takes about the same time as
> writing 3 or 4 bytes to a single non-RAID drive. That, my friends, is why
> RAID5 (especially when done by hardware) actually improves performance.
>
> Terry Fielder
> Network Engineer
> Great Gulf Homes / Ashton Woods Homes
> terry(at)greatgulfhomes(dot)com
>
> > -----Original Message-----
> > From: pgsql-general-owner(at)postgresql(dot)org
> > [mailto:pgsql-general-owner(at)postgresql(dot)org]On Behalf Of Kurt Gunderson
> > Sent: Thursday, May 30, 2002 12:59 PM
> > Cc: pgsql-general(at)postgresql(dot)org
> > Subject: Re: [GENERAL] Scaling with memory & disk planning
> >
> >
> > Bear in mind that I am a newbie to the PostgreSQL world but have
> > experience in other RDBMSs when I ask this question:
> >
> > If you are looking for the best performance, why go with a RAID5 as
> > opposed to a RAID1+0 (mirrored stripes) solution? Understandably RAID5
> > is a cheaper solution requiring fewer drives for redundancy but, from my
> > experience, RAID5 chokes horribly under heavy disk writing. RAID5
> > always requires at least two write operations for every block written;
> > one to the data and one to the redundancy algorithm.
> >
> > Is this wrong?
> >
> > (I mean no disrespect)
> >
> > Tom Lane wrote:
> >
> > > Doug Fields <dfields-pg-general(at)pexicom(dot)com> writes:
> > >
> > >>d) How much extra performance does having the log or indices on a different
> > >>disk buy you, esp. in the instance where you are inserting millions of
> > >>records into a table? An indexed table?
> > >>
> > >
> > > Keeping the logs on a separate drive is a big win, I believe, for heavy
> > > update situations. (For read-only queries, of course the log doesn't
> > > matter.)
> > >
> > > Keeping indexes on a separate drive is also traditional database advice,
> > > but I don't have any feeling for how much it matters in Postgres.
> > >
> > >
> > >>a) Run everything on one 7-drive RAID 5 partition (8th drive as hot spare)
> > >>b) Run logs as a 2-drive mirror and the rest on a 5-drive RAID 5
> > >>c) Run logs on a 2-drive mirror, indices on a 2-drive mirror, and the rest
> > >>on a 3-drive RAID5?
> > >>d) Run logs & indices on a 2-drive mirror and the rest on a 5-drive RAID 5
> > >>
> > >
> > > You could probably get away without mirroring the indices, if you are
> > > willing to incur a little downtime to rebuild them after an index drive
> > > failure. So another possibility is
> > >
> > > 2-drive mirror for log, 1 plain old drive for indexes, rest for data.
> > >
> > > If your data will fit on 2 drives then you could mirror both, still have
> > > your 8th drive as hot spare, and feel pretty secure.
> > >
> > > Note that while it is reasonably painless to configure PG with WAL logs
> > > in a special place (after initdb, move the pg_xlog subdirectory and make
> > > a symlink to its new location), it's not currently easy to separate
> > > indexes from data. So the most practical approach in the short term is
> > > probably your (b).
> > >
> > > regards, tom lane
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 6: Have you searched our list archives?
> > >
> > > http://archives.postgresql.org
> > >
> > >
> >
> >
> > --
> > Kurt Gunderson
> > Senior Programmer
> > Applications Development
> > Lottery Group
> > Canadian Bank Note Company, Limited
> > Email: kgunders(at)cbnlottery(dot)com
> > Phone: 613.225.6566 x326
> > Fax: 613.225.6651
> > http://www.cbnco.com/
> >
> > "Entropy isn't what is used to be"
> >
> > Obtaining any information from this message for the purpose of sending
> > unsolicited commercial Email is strictly prohibited. Receiving this
> > email does not constitute a request of or consent to send unsolicited
> > commercial Email.
> >
> >
> >
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
