Re: Dell Hardware Recommendations

From: "Joe Uhl" <joeuhl(at)gmail(dot)com>
To: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Dell Hardware Recommendations
Date: 2007-08-10 00:54:20
Message-ID: 1186707260.14699.1204649529@webmail.messagingengine.com
Lists: pgsql-general pgsql-performance

Thanks for the input. Thus far we have used Dell but I would certainly
be willing to explore other options.

I found a "Reference Guide" for the MD1000 from April, 2006 that
includes info on the PERC 5/E at:

http://www.dell.com/downloads/global/products/pvaul/en/pvaul_md1000_solutions_guide.pdf

To answer the questions below:

> How many users do you expect to hit the db at the same time?
There are 2 types of users. For roughly every 5000 active accounts, 10
or fewer of those will have additional privileges. Only those more
privileged users interact substantially with the OLAP portion of the
database. For 1 state 10 concurrent connections was about the max, so
if that holds for 50 states we are looking at 500 concurrent users as a
top end, with a very small fraction of those users interacting with the
OLAP portion.
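
As a rough check on that top-end figure, here is a minimal Python sketch
of the arithmetic. The constant and function names are made up for
illustration, and it assumes every state peaks at the same moment with
the same 10 connections seen for 1 state, which is a worst case rather
than a measurement.

```python
# Back-of-envelope estimate of peak concurrent connections.
# The figures come from the message; the names are illustrative only.

PEAK_CONNECTIONS_PER_STATE = 10        # observed max for the 1 state in use today
STATES = 50                            # planned deployment
PRIVILEGED_PER_ACCOUNTS = (10, 5000)   # at most 10 privileged users per 5000 accounts


def peak_connections(per_state: int, states: int) -> int:
    """Worst case: every state hits its own peak at the same moment."""
    return per_state * states


def privileged_fraction(privileged: int, accounts: int) -> float:
    """Upper bound on the share of accounts that can use the OLAP portion."""
    return privileged / accounts


if __name__ == "__main__":
    total = peak_connections(PEAK_CONNECTIONS_PER_STATE, STATES)
    share = privileged_fraction(*PRIVILEGED_PER_ACCOUNTS)
    print(f"top-end concurrent connections: {total}")
    print(f"privileged share of accounts: {share:.1%} or less")
```

If the per-state peaks do not line up in time, the real number would
come in under this estimate.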

> How big of a dataset will each one be grabbing at the same time?
For the OLTP data it is mostly single object reads and writes and
generally touches only a few tables at a time.

> Will your Perc RAID controller have a battery backed cache on board?
> If so (and it better!) how big of a cache can it hold?
According to the above link, it has a 256 MB cache that is battery
backed.

> Can you split this out onto two different machines, one for the OLAP
> load and the other for what I'm assuming is OLTP?
> Can you physically partition this out by state if need be?
Right now this system isn't in production so we can explore any option.
We are looking into splitting the OLAP and OLTP portions right now and I
imagine physically splitting the partitions on the big OLAP table is an
option as well.

Really appreciate all of the advice. Before we pull the trigger on
hardware we probably will get some external advice from someone but I
knew this list would provide some excellent ideas and feedback to get us
started.

Joe Uhl
joeuhl(at)gmail(dot)com

On Thu, 9 Aug 2007 16:02:49 -0500, "Scott Marlowe"
<scott(dot)marlowe(at)gmail(dot)com> said:
> On 8/9/07, Joe Uhl <joeuhl(at)gmail(dot)com> wrote:
> > We have a 30 GB database (according to pg_database_size) running nicely
> > on a single Dell PowerEdge 2850 right now. This represents data
> > specific to 1 US state. We are in the process of planning a deployment
> > that will service all 50 US states.
> >
> > If 30 GB is an accurate number per state that means the database size is
> > about to explode to 1.5 TB. About 1 TB of this amount would be OLAP
> > data that is heavy-read but only updated or inserted in batch. It is
> > also largely isolated to a single table partitioned on state. This
> > portion of the data will grow very slowly after the initial loading.
> >
> > The remaining 500 GB has frequent individual writes performed against
> > it. 500 GB is a high estimate and it will probably start out closer to
> > 100 GB and grow steadily up to and past 500 GB.
> >
> > I am trying to figure out an appropriate hardware configuration for such
> > a database. Currently I am considering the following:
> >
> > PowerEdge 1950 paired with a PowerVault MD1000
> > 2 x Quad Core Xeon E5310
> > 16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)
> > PERC 5/E Raid Adapter
> > 2 x 146 GB SAS in Raid 1 for OS + logs.
> > A bunch of disks in the MD1000 configured in Raid 10 for Postgres data.
> >
> > The MD1000 holds 15 disks, so 14 disks + a hot spare is the max. With
> > 12 250GB SATA drives to cover the 1.5TB we would be able to add another
> > 250GB of usable space for future growth before needing to get a bigger
> > set of disks. 500GB drives would leave a lot more room and could allow
> > us to run the MD1000 in split mode and use its remaining disks for other
> > purposes in the meantime. I would greatly appreciate any feedback with
> > respect to drive count vs. drive size and SATA vs. SCSI/SAS. The price
> > difference makes SATA awfully appealing.
> >
> > We plan to involve outside help in getting this database tuned and
> > configured, but want to get some hardware ballparks in order to get
> > quotes and potentially request a trial unit.
> >
> > Any thoughts or recommendations? We are running openSUSE 10.2 with
> > kernel 2.6.18.2-34.
>
> Some questions:
>
> How many users do you expect to hit the db at the same time?
> How big of a dataset will each one be grabbing at the same time?
> Will your Perc RAID controller have a battery backed cache on board?
> If so (and it better!) how big of a cache can it hold?
> Can you split this out onto two different machines, one for the OLAP
> load and the other for what I'm assuming is OLTP?
> Can you physically partition this out by state if need be?
>
> A few comments:
>
> I'd go with the bigger drives. Just as many, so you have spare
> storage as you need it. You never know when you'll need to migrate
> your whole data set from one pg db to another for testing etc...
> extra space comes in REAL handy when things aren't quite going right.
> With 10krpm 500 and 750 Gig drives you can use smaller partitions on
> the bigger drives to short stroke them and often outrun supposedly
> faster drives.
>
> The difference between SAS and SATA drives is MUCH less important than
> the difference between one RAID controller and the next. It's not
> likely the Dell is gonna come with the fastest RAID controllers
> around, as they seem to still be selling Adaptec (buggy and
> unreliable, avoid like the plague) and LSI (stable, moderately fast).
>
> I.e. I'd rather have 24 SATA disks plugged into a couple of big Areca
> or 3ware (now escalade I think?) controllers than 8 SAS drives plugged
> into any Adaptec controller.
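
The drive-count figures in the quoted message can be sanity-checked with
a short Python sketch. It assumes plain RAID 10 (mirrored pairs, so half
the raw capacity is usable), decimal gigabytes, and no filesystem or
controller overhead; the function name is hypothetical.

```python
# Usable capacity of a RAID 10 array: half the disks hold mirror copies.
# Assumes an even disk count of at least 4 and identical disk sizes.

def raid10_usable_gb(disk_count: int, disk_size_gb: int) -> int:
    """Return usable space in GB for a RAID 10 set of identical disks."""
    if disk_count < 4 or disk_count % 2 != 0:
        raise ValueError("this sketch expects an even number of disks, 4 or more")
    return (disk_count // 2) * disk_size_gb


if __name__ == "__main__":
    # 12 x 250GB covers the projected 1.5TB.
    print(raid10_usable_gb(12, 250))
    # Filling the MD1000 to 14 data disks (plus 1 hot spare) adds headroom.
    print(raid10_usable_gb(14, 250) - raid10_usable_gb(12, 250))
    # The same 14 slots with 500GB drives.
    print(raid10_usable_gb(14, 500))
```

This only covers capacity. It says nothing about the controller or
spindle-count effects discussed in the reply.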
