From: | Ron <rjpeace(at)earthlink(dot)net> |
---|---|
To: | "Luke Lonergan" <LLonergan(at)greenplum(dot)com>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Hardware/OS recommendations for large databases |
Date: | 2005-11-27 17:10:44 |
Message-ID: | 6.2.5.6.0.20051127114155.01dbf868@earthlink.net |
Lists: | pgsql-performance |
At 01:18 AM 11/27/2005, Luke Lonergan wrote:
>For data warehousing it's pretty well open and shut. To use all cpus
>and io channels on each query you will need mpp.
>
>Has anyone done the math on the original post? 5TB takes how long
>to scan once? If you want to wait less than a couple of days just
>for a seq scan, you'd better be in the multi-gb per second range.
More than a bit of hyperbole there, Luke.
Some common real-world scenarios:
Dual 1GbE NICs => ~200MBps => 5TB in 5x10^12 / 2x10^8 = 25000secs =
~6hrs57mins. Network stuff like re-transmits of dropped packets can
increase this, so network SLAs are critical.
Dual 10GbE NICs => ~1.6GBps (10GbE NICs can't yet do over ~800MBps
apiece) => 5x10^12 / 1.6x10^9 = 3125secs = ~52mins. SLAs are even
more critical here.
If you are pushing 5TB around on a regular basis, you are not wasting
your time & money on commodity <= 300MBps RAID HW. You'll be using
800MBps and 1600MBps high end stuff, which means you'll need roughly
1-2hrs (~1hr44mins at 800MBps, ~52mins at 1600MBps) to sequentially
scan 5TB on physical media.
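For anyone who wants to re-run the arithmetic above, here is a minimal Python sketch. It assumes decimal units (1TB = 10^12 bytes, 1MBps = 10^6 bytes/sec) and treats the quoted rates as sustained throughput with no protocol or seek overhead. The function names and scenario labels are made up for illustration.

```python
# Back-of-the-envelope scan time: bytes to read / sustained bytes per second.
# Assumes decimal units and no overhead (re-transmits, seeks, contention).
DATA_BYTES = 5 * 10**12  # 5TB


def scan_seconds(data_bytes, bytes_per_sec):
    """Seconds needed to read data_bytes once at a sustained rate."""
    return data_bytes / bytes_per_sec


def fmt(seconds):
    """Render seconds as hours and minutes, rounded to the nearest minute."""
    hours, minutes = divmod(round(seconds / 60), 60)
    return f"{hours}h{minutes:02d}m"


# Rates taken from the scenarios above, in bytes/sec.
scenarios = {
    "dual 1GbE (~200MBps)": 200e6,
    "dual 10GbE (~1.6GBps)": 1.6e9,
    "RAID at 800MBps": 800e6,
    "RAID at 1600MBps": 1600e6,
}

for name, rate in scenarios.items():
    print(f"{name}: {fmt(scan_seconds(DATA_BYTES, rate))}")
```

Real numbers will be worse to the extent the hardware can't sustain its rated throughput, so treat these as lower bounds.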
Clever use of RAM can get a 5TB sequential scan down to ~17mins.
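The ~17min figure isn't broken down here, but working backward shows what sustained throughput it implies. A small sketch, again assuming decimal units; `required_gbytes_per_sec` is a hypothetical helper name, not anything from a library.

```python
# Invert the scan-time formula: what sustained rate gives a target scan time?
# Assumes decimal units (1TB = 10**12 bytes, 1GBps = 10**9 bytes/sec).
DATA_BYTES = 5 * 10**12  # 5TB


def required_gbytes_per_sec(data_bytes, target_minutes):
    """Sustained GBps needed to read data_bytes once in target_minutes."""
    return data_bytes / (target_minutes * 60) / 1e9


print(f"{required_gbytes_per_sec(DATA_BYTES, 17):.1f} GBps")
```

That comes to a bit under 5GBps, which is why this takes RAM rather than disk controllers.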
Yes, it's a lot of data. But sequential scan times should be in the
mins or low single digit hours, not days. Particularly if you use
RAM to maximum advantage.
Ron