Re: Which hardware ?

From: Jon D <slackmaster22(at)hotmail(dot)com>
To: Lionel <lionel(at)art-informatique(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Which hardware ?
Date: 2008-06-18 01:14:58
Message-ID: BAY102-W12ED7E7A2005665400F9D2A5AB0@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

----------------------------------------
> From: lionel(at)art-informatique(dot)com
> Subject: [PERFORM] Which hardware ?
> Date: Tue, 17 Jun 2008 15:38:59 +0200
> To: pgsql-performance(at)postgresql(dot)org
>
> Hi,
>
> I need to install a 8.3 database and was wondering which hardware would be
> sufficient to have good performances (less than 30s for² slowest select).
>
> Database size: 25 Go /year, 5 years of history
> One main table containing 40 million lines per year.
> Batch inserts of 100000 lines. Very very few deletes, few updates.
>
> 30 other tables, 4 levels of hierarchy, containing from 10 lines up to 20000
> lines.
> 5 of them have forein keys on the main table.
>
> I will use table partitionning on the year column.
>
> Statements will mainly do sums on the main table, grouped by whatever column
> of the database (3-5 joined tables, or join on join), with some criterions
> that may vary, lots of "joined varchar in ('a','b',...,'z')".
> It's almost impossible to predict what users will do via the webapplication
> that queries this database: almost all select, join, group by, where...
> possibilities are available.
>
> Up to 4 simultaneous users.
>
> I'm planning to host it on a quad xeon 2.66Ghz with 8Go of DDR2, and a dual
> (RAID1) SATA2 750Go HD.
> Perharps with another HD for indexes.
>
> Do you think it will be enough ?
> Is another RAID for better performances a minimum requirement ?
> Will a secondary HD for indexes help ?
>
> Which OS would you use ? (knowing that there will be a JDK 1.6 installed
> too)
>
> With 5 millions of lines, the same application runs quite fast on windows
> 2000 on a single P4 2.8 GHz (very few statements last more than 10s, mostly
> when concurrent statements are made). Each statement consumes 100% of the
> CPU.
>
>
> thanks for advices.
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

I think hardware isnt going to solve your problem, especially the cpu. You only have four users.. and postgres can only use 1 core per query. If you have sequential scans that span this table and say it has 60-80 million rows, It can could take longer then 30 seconds. Even if you have alot of ram. Just imagine what postgres is doing... if its target search is going to end in searching 40 million rows and it has to aggregate on two, or three columns its going to be slow. No amount of hardware is going to fix this. Sure you can gain some speed by having entire tables in ram. No magic bullet here. Disk is definitely not a magic bullet. Even if you have a bunch of fast disks its still much slower then RAM in performing reads. So if you read heavy then adding more disk isnt going to just solve all your problems. RAM is nice. The more pages you can keep in ram the less reading from the disk.

Even with that all said and done... aggregating lots of rows takes time. I suggest you come up with a system from preaggregating your data if possible. Identify all of your target dimensions. If your lucky, you only have a few key dimensions which can reduce size of table by lots and reduce queries to 1-2 seconds. There are a number of ways to tackle this, but postgres is a nice db to do this with, since writers do not block readers.

I think you should focus on getting this system to work well with minimal hardware first. Then you can upgrade. Over the next few years the db is only going to get larger. You have 4 users now.. but who's to say what it will evolve into.
_________________________________________________________________
Earn cashback on your purchases with Live Search - the search that pays you back!
http://search.live.com/cashback/?&pkw=form=MIJAAF/publ=HMTGL/crea=earncashback

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Howard Cole 2008-06-18 10:40:11 Re: Tsearch2 Initial Search Speed
Previous Message Greg Smith 2008-06-17 21:15:14 Re: Which hardware ?