Re: Hardware/OS recommendations for large databases (

From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Scott Marlowe" <smarlowe(at)g2switchworks(dot)com>, "William Yu" <wyu(at)talisys(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Hardware/OS recommendations for large databases (
Date: 2005-11-16 17:47:26
Message-ID: BFA0B0AE.13D4E%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Scott,

On 11/16/05 9:09 AM, "Scott Marlowe" <smarlowe(at)g2switchworks(dot)com> wrote:

> The biggest gain is going from 1 to 2 CPUs (real cpus, like the DC
> Opterons or genuine dual CPU mobo, not "hyperthreaded"). Part of the
> issue isn't just raw CPU processing power. The second CPU allows the
> machine to be more responsive because it doesn't have to context switch
> as much.
>
> While I've seen plenty of single CPU servers start to bog under load
> running one big query, the dual CPU machines always seem more than just
> twice as snappy under similar loads.
>
I agree, 2 CPUs are better than one in most cases.

The discussion was kicked off by the suggestion to get 8 dual core CPUs to
process a large database with postgres. Say your decision support query
takes 15 minutes to run with one CPU. Add another and it still takes 15
minutes. Add 15 and the same ...

OLTP is so different from Business intelligence and Decision Support that
very little of this thread¹s discussion is relevant IMO.

The job is to design a system that can process sequential scan as fast as
possible and uses all resources (CPUs, mem, disk channels) on each query.
Sequential scan is 100x more important than random seeks.

Here are the facts so far:
* Postgres can only use 1 CPU on each query
* Postgres I/O for sequential scan is CPU limited to 110-120 MB/s on the
fastest modern CPUs
* Postgres disk-based sort speed is 1/10 or more slower than commercial
databases and memory doesn¹t improve it (much)

These are the conclusions that follow about decision support / BI system
architecture for normal Postgres:
* I/O hardware with more than 110MB/s of read bandwidth is not useful
* More than 1 CPU is not useful
* More RAM than a nominal amount for small table caching is not useful

In other words, big SMP doesn¹t address the problem at all. By contrast,
having all CPUs on multiple machines, or even on a big SMP with lots of I/O
channels, solves all of the above issues.

Regards,

- Luke

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Luke Lonergan 2005-11-16 17:49:28 Re: Hardware/OS recommendations for large databases (
Previous Message Scott Marlowe 2005-11-16 17:09:38 Re: Hardware/OS recommendations for large databases (