Re: Huge Data sets, simple queries

From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: "Mike Biamonte" <mike(at)dbeat(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Huge Data sets, simple queries
Date: 2006-01-31 22:52:57
Message-ID: C0052A49.1B62D%llonergan@greenplum.com
Lists: pgsql-performance

Jim,

On 1/31/06 11:21 AM, "Jim C. Nasby" <jnasby(at)pervasive(dot)com> wrote:

> (BTW, I did some testing that seems to confirm this)
>
> Why couldn't you double the bandwidth? If you're doing a largish read
> you should be able to do something like have drive A read the first
> track, drive B the second, etc. Of course that means that the
> controller or OS would have to be able to stitch things back together.

It's because your alternating reads skip in chunks across the platter.
Disks deliver their maximum internal rate only when reading sequential
data, and the on-drive cache is often built to buffer a track at a time,
so reading alternating, non-contiguous pieces halves the sustained
internal bandwidth of each drive - the aggregate for the pair equals one
drive's sustained internal bandwidth.
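
As a back-of-envelope illustration (the numbers here are made up): each
mirror's head still has to sweep the file's full extent on the platter,
but only transfers every other track, so the pair finishes no faster
than one drive reading the whole thing sequentially.

# Toy model: two mirrored drives alternating tracks on a big
# sequential scan. All figures are hypothetical.
seq_bw_mb_s = 80.0   # assumed sustained sequential rate of one drive
file_mb = 8000.0     # assumed size of the scan

# One drive, pure sequential read:
t_single = file_mb / seq_bw_mb_s

# Two mirrors alternating tracks: each head sweeps the full extent of
# the file but transfers only half the data, so each drive's effective
# rate is halved.
t_pair = (file_mb / 2) / (seq_bw_mb_s / 2)

print(t_single, t_pair)  # identical - the pair aggregates to one
                         # drive's sustained bandwidth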

This works differently for RAID0, where the chunks are allocated to each
drive and laid down contiguously on each, so that when they're read back,
each drive runs at its sustained sequential throughput.
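
For illustration, here's roughly how classic RAID0 striping maps a
logical block to a drive - the chunk size and drive count are just
example parameters, not tied to any particular controller:

# Sketch of classic RAID0 address mapping.
def raid0_map(lba, chunk_sectors=128, n_drives=2):
    chunk = lba // chunk_sectors     # which chunk the LBA falls in
    drive = chunk % n_drives         # chunks rotate across the drives
    stripe = chunk // n_drives       # row of chunks across the set
    return drive, stripe * chunk_sectors + lba % chunk_sectors

# A long logical run hits each drive in contiguous, chunk-sized
# pieces, so on read-back every spindle streams at its full
# sequential rate.
for lba in range(0, 1024, 128):
    print(lba, raid0_map(lba))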

The alternating technique in mirroring might improve rotational latency
for random seeks - a trick that Tandem exploited - but it won't improve
bandwidth.
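
The rotational win is easy to see with a quick simulation: if a read
can be serviced by whichever mirror's head passes the sector first, and
the two platters' rotational positions are independent, the expected
wait drops from 1/2 revolution to 1/3 (E[min] of two uniform random
variables). This is a simplified model that ignores seek time:

import random

# Simplified model: rotational latency only, measured in revolutions.
N = 200_000
one_drive = sum(random.random() for _ in range(N)) / N
best_of_two = sum(min(random.random(), random.random())
                  for _ in range(N)) / N
print(f"single drive: {one_drive:.3f} rev")   # ~0.500
print(f"best of two:  {best_of_two:.3f} rev") # ~0.333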

> As for software raid, I'm wondering how well that works if you can't use
> a BBU to allow write caching/re-ordering...

Works great with standard OS write caching.
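
For a feel of what the OS cache buys you, here's a minimal sketch
(path and write sizes are arbitrary, and O_SYNC assumes a Unix-like
system): buffered writes land in the page cache and return
immediately, letting the kernel coalesce and reorder them before they
reach the software-RAID layer, much like a BBU cache does on a
hardware controller.

import os, time

def timed_writes(flags, path="/tmp/wc_test"):
    # Write 1000 x 8 KB; with flags=0 the writes are absorbed by the
    # page cache, with os.O_SYNC each must reach stable storage.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | flags)
    buf = b"x" * 8192
    t0 = time.time()
    for _ in range(1000):
        os.write(fd, buf)
    os.close(fd)
    return time.time() - t0

print("buffered:", timed_writes(0))
print("O_SYNC:  ", timed_writes(os.O_SYNC))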

- Luke
