Re: Performance and Clustering

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Jaime Rodriguez <jaime(dot)rodriguez(at)liberux(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Performance and Clustering
Date: 2010-04-29 19:41:10
Message-ID: t2gdcc563d11004291241l3ee108a5yd40e3718440416ea@mail.gmail.com
Lists: pgsql-general

On Wed, Apr 28, 2010 at 7:08 PM, Jaime Rodriguez
<jaime(dot)rodriguez(at)liberux(dot)com> wrote:
> hi,
> Today is my first day looking at PostgreSQL
> I am looking to migrate a MS SQL DB to PostgreSQL :) :)
> My customer requires that DBMS shall support 4000 simultaneous requests
> Also the system to be deploy maybe a cluster, with 12 microprocessors

I'm gonna jump in here and say that if you have 4,000 requests running
at the same time, you're gonna want a REALLY big machine.

I admin a setup where two db servers handle ~200 simultaneous requests:
almost all of them are very short, millisecond-long requests, a few run
around 100 milliseconds, and a very few run for seconds.

With 8 2.1 GHz Opteron cores, 32 Gigs of RAM, and 14x15k drives apiece,
those machines run with a load factor in the range of 10 to 15. The CPUs
are maxed out at that level of load, and IO is 70 to 80% utilized
according to iostat -x. IO wait is generally one core's worth at most.
Some of that load has to stay on the master, but a lot of it can be
handled by slaves.

Your load, if you really are going to have 4,000 simultaneous
connections, is likely going to need about 20 times the capacity my
setup provides. Given that the newer 12-core AMDs are somewhat faster
per core, you could probably get away with two or three really big
machines. If you were to use 96-core machines (8 sockets x 12 cores)
with as many disks as you can throw at them (40 to 100), then you're in
the ballpark for a set of machines that can process 4,000 simultaneous
requests, assuming a mostly-read (80% or so) workload. We're talking a
large chunk of a full-sized rack just to hold all the drives and cores
you'd need.
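
To make that back-of-the-envelope math concrete, here's a rough sketch
in Python using the numbers above. The 1.5x per-core speedup for the
newer 12-core AMDs is purely my guess, so plug in whatever your own
benchmarks say:

# Rough capacity estimate from the figures in this thread.
current_requests = 200      # simultaneous requests my two boxes handle
current_cores = 2 * 8       # 8 Opteron cores per box
target_requests = 4000

scale = target_requests / float(current_requests)   # ~20x
newer_core_speedup = 1.5                             # assumption, not measured

cores_needed = current_cores * scale / newer_core_speedup
machines = cores_needed / 96.0                       # 8 sockets x 12 cores

print("roughly %d newer cores, i.e. %.1f machines of 96 cores"
      % (cores_needed, machines))

That comes out to a little over two 96-core boxes, which is consistent
with the two-or-three machine guess above.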

But this brings up a lot of questions about whether you can partition
your dataset, things like that. Do all of these 4,000 simultaneous
requests need to update the same exact data set? Or are they mostly
read-only reporting users? Can you use memcached to handle part of the
load? Usage patterns inform a great deal about how to size a system to
handle that much load.
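
If most of those 4,000 requests really are read-mostly reporting
queries, a read-through cache is one way memcached can soak up part of
the load before it ever reaches the database. A minimal sketch, assuming
Python with psycopg2 and python-memcached; the reports table, key
scheme, DSN, and TTL are made up for illustration:

import memcache
import psycopg2

mc = memcache.Client(["127.0.0.1:11211"])
db = psycopg2.connect("dbname=app user=app")   # hypothetical DSN

def get_report(report_id, ttl=60):
    # Serve from memcached if we can; otherwise hit Postgres and cache it.
    key = "report:%d" % report_id
    cached = mc.get(key)
    if cached is not None:
        return cached

    cur = db.cursor()
    cur.execute("SELECT title, body FROM reports WHERE id = %s",
                (report_id,))
    row = cur.fetchone()
    cur.close()
    if row is None:
        return None

    result = {"title": row[0], "body": row[1]}
    mc.set(key, result, time=ttl)   # python-memcached pickles the dict
    return result

The point isn't this exact code; it's that every cache hit is one less
query fighting for those cores and spindles.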
