Re: Hardware/OS recommendations for large databases

From: Ron <rjpeace(at)earthlink(dot)net>
To: "Luke Lonergan" <LLonergan(at)greenplum(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Hardware/OS recommendations for large databases
Date: 2005-11-27 17:10:44
Message-ID: 6.2.5.6.0.20051127114155.01dbf868@earthlink.net
Lists: pgsql-performance

At 01:18 AM 11/27/2005, Luke Lonergan wrote:
>For data warehousing it's pretty well open and shut. To use all cpus
>and io channels on each query you will need mpp.
>
>Has anyone done the math on the original post? 5TB takes how long
>to scan once? If you want to wait less than a couple of days just
>for a seq scan, you'd better be in the multi-gb per second range.
More than a bit of hyperbole there, Luke.

Some common real-world scenarios:
Dual 1GbE NICs => 200MBps => 5TB in (5x10^12)/(2x10^8) = 25000secs =
~6hrs57mins. Network issues like re-transmits of dropped packets can
increase this, so network SLAs are critical.

Dual 10GbE NICs => ~1.6GBps (10GbE NICs can't yet do over ~800MBps
apiece) => (5x10^12)/(1.6x10^9) = 3125secs = ~52mins. SLAs are even
more critical here.

If you are pushing 5TB around on a regular basis, you are not wasting
your time & money on commodity <= 300MBps RAID HW. You'll be using
800MBps and 1600MBps high-end stuff, which means you'll need ~1-2hrs
to sequentially scan 5TB on physical media (~1hr44mins at 800MBps,
~52mins at 1600MBps).
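
As a rough check of the arithmetic above, here is a minimal Python sketch. The function names and labels are illustrative and not from the original post. It assumes decimal units (5TB = 5x10^12 bytes) and a perfectly sustained transfer rate, so real scans would take somewhat longer.

```python
def scan_seconds(size_bytes: float, rate_bytes_per_sec: float) -> float:
    """Seconds needed to read size_bytes once at a sustained rate."""
    return size_bytes / rate_bytes_per_sec


def fmt(seconds: float) -> str:
    """Format a duration as hours and minutes, rounded to the nearest minute."""
    total_minutes = round(seconds / 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h{minutes:02d}m"


SIZE = 5e12  # 5TB in decimal units, matching the figures in the post

# Sustained rates quoted in the post, in bytes per second.
RATES = {
    "dual 1GbE NICs (200MBps)": 2e8,
    "commodity RAID (300MBps)": 3e8,
    "high-end RAID (800MBps)": 8e8,
    "dual 10GbE / high-end RAID (1600MBps)": 1.6e9,
}

for label, rate in RATES.items():
    print(f"{label:40s} {fmt(scan_seconds(SIZE, rate))}")
```

Even the slowest rate listed comes out in single-digit hours, which is the point being made against the "couple of days" estimate.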

Clever use of RAM can get a 5TB sequential scan down to ~17mins.
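
The calculation can also be run backwards to see what sustained throughput a given scan time implies. The sketch below is plain arithmetic on the numbers in this thread, not a measurement; `required_rate` is a hypothetical helper name, and "a couple of days" is assumed to mean 48 hours.

```python
def required_rate(size_bytes: float, target_seconds: float) -> float:
    """Sustained bytes per second needed to read size_bytes in target_seconds."""
    return size_bytes / target_seconds


SIZE = 5e12  # 5TB in decimal units

TARGETS = {
    "17 minutes": 17 * 60,
    "2 days (assumed meaning of 'a couple of days')": 2 * 24 * 3600,
}

for label, seconds in TARGETS.items():
    mbps = required_rate(SIZE, seconds) / 1e6
    print(f"{label}: ~{mbps:,.0f} MBps")
```

A ~17 minute scan works out to roughly 4.9GBps sustained, while a two-day scan needs only about 29MBps.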

Yes, it's a lot of data. But sequential scan times should be in the
minutes or low single-digit hours, not days. Particularly if you use
RAM to maximum advantage.

Ron
