Re: Scaling with memory & disk planning

From: terry(at)greatgulfhomes(dot)com
To: <kgunders(at)cbnlottery(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Scaling with memory & disk planning
Date: 2002-05-30 17:46:38
Message-ID: 001d01c20801$f74cf300$2766f30a@development.greatgulfhomes.com
Lists: pgsql-general

Your RAID analysis is a bit wrong.
In striping (disk joining) every byte written requires 1 byte sent to 1
disk. This gives you ZERO redundancy: RAID0 is used purely for making a
large partition from smaller disks.

In RAID1 (mirroring) every 1 byte written requires 1 byte written to EACH of
the 2 mirrored disks, for a total disk IO of 2 bytes.

In RAID5, the most efficient solution, every 1 byte written requires LESS
than 1 byte written for the parity. Roughly (depending on implementation and
number of disks) every 3 bytes written requires 4 bytes of disk IO.
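
The byte counts above can be sketched as a small model. This is an illustration rather than a benchmark: the function name `disk_bytes_written` is hypothetical, RAID1 is taken to be a 2-disk mirror, and RAID5 is assumed to write whole stripes with one redundancy byte per stripe. Partial-stripe updates are not modeled.

```python
def disk_bytes_written(level: str, payload_bytes: int, drives: int = 4) -> float:
    """Total bytes sent to disk for a full-stripe write of payload_bytes.

    Assumptions (not from the original post): full-stripe writes only,
    RAID1 is a 2-disk mirror, RAID5 stores one redundancy byte for every
    (drives - 1) data bytes.
    """
    if level == "raid0":
        # Striping: each byte goes to exactly one disk, no redundancy.
        return float(payload_bytes)
    if level == "raid1":
        # Mirroring: each byte is written to both disks.
        return 2.0 * payload_bytes
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID5 needs at least 3 drives")
        data_drives = drives - 1
        return payload_bytes * drives / data_drives
    raise ValueError(f"unknown RAID level: {level}")


for level in ("raid0", "raid1", "raid5"):
    print(level, disk_bytes_written(level, 3, drives=4))
```

With 4 drives the RAID5 case reproduces the "3 bytes written, 4 bytes of disk IO" figure; adding drives pushes the ratio closer to 1.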

RAID5 is the fastest from an algorithmic standpoint. There are some gotchas:
RAID5 implemented in hardware is faster than RAID5 implemented by the OS, simply
because the controller on the SCSI card acts like a parallel processor.

RAID5 also wastes the least amount of disk space.

Which is cheapest is relative. What is certain is that RAID5
requires more disks (at least 3) than mirroring (exactly 2), but RAID5
wastes less space, so the cost analysis begins with a big "it depends...".
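
To put rough numbers on the space trade-off, here is a minimal sketch. The helper name `usable_fraction` is hypothetical, and it assumes equal-sized disks, a 2-disk mirror for RAID1, and one disk's worth of redundancy for RAID5.

```python
def usable_fraction(level: str, drives: int) -> float:
    """Fraction of raw capacity left for data, assuming equal-sized disks."""
    if level == "raid1":
        if drives != 2:
            raise ValueError("this sketch models a 2-disk mirror only")
        return 0.5  # one disk of data, one disk holding the copy
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID5 needs at least 3 drives")
        return (drives - 1) / drives  # one disk's worth of redundancy
    raise ValueError(f"unknown RAID level: {level}")


print("raid1, 2 drives:", usable_fraction("raid1", 2))
for n in (3, 5, 7):
    print(f"raid5, {n} drives:", round(usable_fraction("raid5", n), 3))
```

The mirror always gives up half the raw space, while the RAID5 overhead shrinks as drives are added. That is the "it depends" part: more disks to buy, but a larger share of each one holds data.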

Any disk system will choke under heavy load, especially if the disk write
system is inefficient (like IBM's IDE interface). I think if you did a
test, you would find RAID1 would choke more than RAID5 simply because RAID1
requires MORE disk IO for the same bytes being saved.

Referring to what Tom Lane said, he recommends 7 drive RAID5 for a very good
reason: The more drives, the faster the performance. Here's why:
Write 7 bytes on a 7 drive RAID5, the first byte goes to drive 1, 2nd byte
to drive 2, etc, and the parity to the final drive. For high performance SCSI
systems, whose BUS IO is faster than the drives (and most SCSI IO chains ARE
faster than the drives they are attached to) the drives actually write in
PARALLEL. I can give you a more detailed example, but suffice to say that
with RAID5 writing 7 bytes to 7 data drives takes about the same time as
writing 3 or 4 bytes to a single non-RAID drive. That, my friends, is why
RAID5 (especially when done by hardware) actually improves performance.
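
The byte-per-drive layout described above can be sketched as follows. This is a simplified illustration under stated assumptions: the redundancy byte is computed as XOR parity (the usual RAID5 scheme), it is placed on the last drive as in the description (real RAID5 rotates it across drives and works on blocks, not single bytes), and a 7-drive array is taken to hold 6 data bytes plus 1 parity byte per stripe. The function names are hypothetical.

```python
from functools import reduce
from operator import xor


def stripe_with_parity(data: bytes) -> list:
    """Return one byte per drive: the data bytes followed by their XOR parity."""
    return list(data) + [reduce(xor, data, 0)]


def rebuild_missing(stripe: list, lost_drive: int) -> int:
    """Recover a failed drive's byte by XOR-ing the bytes on the survivors."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_drive]
    return reduce(xor, survivors, 0)


# 6 data bytes spread over a 7-drive array: each drive receives one byte,
# so all 7 writes can be issued at the same time.
stripe = stripe_with_parity(b"PGDATA")
for drive, value in enumerate(stripe, start=1):
    print(f"drive {drive}: 0x{value:02x}")

# Losing any single drive is recoverable from the other six.
print(rebuild_missing(stripe, lost_drive=2) == stripe[2])
```

Because every drive gets exactly one byte of the stripe, the elapsed time is roughly that of one drive's write, provided the bus can keep all drives busy.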

Terry Fielder
Network Engineer
Great Gulf Homes / Ashton Woods Homes
terry(at)greatgulfhomes(dot)com

> -----Original Message-----
> From: pgsql-general-owner(at)postgresql(dot)org
> [mailto:pgsql-general-owner(at)postgresql(dot)org]On Behalf Of Kurt Gunderson
> Sent: Thursday, May 30, 2002 12:59 PM
> Cc: pgsql-general(at)postgresql(dot)org
> Subject: Re: [GENERAL] Scaling with memory & disk planning
>
>
> Bear in mind that I am a newbie to the PostgreSQL world but have
> experience in other RDBMSs when I ask this question:
>
> If you are looking for the best performance, why go with a RAID5 as
> opposed to a RAID1+0 (mirrored stripes) solution? Understandably RAID5
> is a cheaper solution requiring fewer drives for redundancy but, from my
> experience, RAID5 chokes horribly under heavy disk writing. RAID5
> always requires at least two write operations for every block written;
> one to the data and one to the redundancy algorithm.
>
> Is this wrong?
>
> (I mean no disrespect)
>
> Tom Lane wrote:
>
> > Doug Fields <dfields-pg-general(at)pexicom(dot)com> writes:
> >
> >>d) How much extra performance does having the log or indices on a different
> >>disk buy you, esp. in the instance where you are inserting millions of
> >>records into a table? An indexed table?
> >>
> >
> > Keeping the logs on a separate drive is a big win, I believe, for heavy
> > update situations. (For read-only queries, of course the log doesn't
> > matter.)
> >
> > Keeping indexes on a separate drive is also traditional database advice,
> > but I don't have any feeling for how much it matters in Postgres.
> >
> >
> >>a) Run everything on one 7-drive RAID 5 partition (8th drive as hot spare)
> >>b) Run logs as a 2-drive mirror and the rest on a 5-drive RAID 5
> >>c) Run logs on a 2-drive mirror, indices on a 2-drive mirror, and the rest
> >>on a 3-drive RAID5?
> >>d) Run logs & indices on a 2-drive mirror and the rest on a 5-drive RAID 5
> >>
> >
> > You could probably get away without mirroring the indices, if you are
> > willing to incur a little downtime to rebuild them after an index drive
> > failure. So another possibility is
> >
> > 2-drive mirror for log, 1 plain old drive for indexes, rest for data.
> >
> > If your data will fit on 2 drives then you could mirror both, still have
> > your 8th drive as hot spare, and feel pretty secure.
> >
> > Note that while it is reasonably painless to configure PG with WAL logs
> > in a special place (after initdb, move the pg_xlog subdirectory and make
> > a symlink to its new location), it's not currently easy to separate
> > indexes from data. So the most practical approach in the short term is
> > probably your (b).
> >
> > regards, tom lane
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 6: Have you searched our list archives?
> >
> > http://archives.postgresql.org
> >
> >
>
>
> --
> Kurt Gunderson
> Senior Programmer
> Applications Development
> Lottery Group
> Canadian Bank Note Company, Limited
> Email: kgunders(at)cbnlottery(dot)com
> Phone: 613.225.6566 x326
> Fax: 613.225.6651
> http://www.cbnco.com/
>
> "Entropy isn't what is used to be"
>
> Obtaining any information from this message for the purpose of sending
> unsolicited commercial Email is strictly prohibited. Receiving this
> email does not constitute a request of or consent to send unsolicited
> commercial Email.
>
>
