Re: Hardware recommendations

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Benjamin Krajmalnik <kraj(at)servoyant(dot)com>
Cc: John W Strange <john(dot)w(dot)strange(at)jpmchase(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Hardware recommendations
Date: 2010-12-09 02:28:38
Message-ID: AANLkTi=tvN=1YSR7u_qocL83SHVjT9iXmzJF-nQapQvg@mail.gmail.com
Lists: pgsql-performance

On Wed, Dec 8, 2010 at 5:03 PM, Benjamin Krajmalnik <kraj(at)servoyant(dot)com> wrote:
> John,
>
> The platform is a network monitoring system, so we have quite a lot of inserts/updates (every data point has at least one record insert as well as at least 3 record updates).  The management GUI has a lot of selects.  We are refactoring the database to some degree to aid in the performance, since the performance degradations are correlated to the number of users viewing the system GUI.

Scalability here may be better addressed by having something like hot
read-only slaves for the users who want to view data.
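
With 9.0's hot standby plus streaming replication that's pretty easy
to set up now. A rough sketch (hostnames and the replication user are
made up):

    # postgresql.conf on the master
    wal_level = hot_standby
    max_wal_senders = 3
    wal_keep_segments = 128

    # postgresql.conf on each slave
    hot_standby = on

    # recovery.conf on each slave
    standby_mode = 'on'
    primary_conninfo = 'host=master.example.com user=replicator'

Point the GUI's selects at the slaves and keep the master for the
insert/update stream.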

> My biggest concern with SSD drives is their life expectancy,

Generally that's not a big issue, especially as the SSDs get larger.
Being able to survive a power loss without corruption is more of an
issue, so if you go SSD get ones with a supercapacitor that can write
out the data before power down.

> as well as our need for relatively high capacity.

Ahhh, capacity is where SSDs start to lose out quickly. Cheap 10k SAS
drives, and to a lesser extent 15k drives, cost far less per gigabyte
than SSDs, and you can only fit so many SSDs onto a single controller /
into a single cage before you're broke.

>  From a purely scalability perspective, this setup will need to support terabytes of data.  I suppose I could use table spaces to use the most accessed data in SSD drives and the rest on regular drives.
> As I stated, I am moving to RAID 10, and was just wondering if the logs should still be moved off to different spindles, or will leaving them on the RAID10 be fine and not affect performance.
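
Tablespaces work fine for that kind of hot/cold split. A minimal
sketch, assuming a hypothetical SSD mount at /ssd/pgdata and a busy
table named datapoints:

    CREATE TABLESPACE ssd LOCATION '/ssd/pgdata';
    ALTER TABLE datapoints SET TABLESPACE ssd;
    ALTER INDEX datapoints_pkey SET TABLESPACE ssd;

Just be aware that SET TABLESPACE rewrites the table and holds an
exclusive lock while it does.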

With a battery-backed caching RAID controller, it's more important
that the pg_xlog files be on a different partition than on a different
RAID set. I.e. you can have one big RAID set and set aside the first
100G or so for pg_xlog. This has to do with fsync behaviour. On Linux
it's a known issue; I'm not sure how much of one it is on BSD. Either
way, you should test for fsync contention.
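
Moving pg_xlog onto its own partition is just a symlink with the
server shut down (paths and device names here are made up):

    # as root, postgres stopped; sda1 is the first ~100G partition
    mount /dev/sda1 /mnt/xlog
    mkdir /mnt/xlog/pg_xlog
    mv /var/lib/pgsql/data/pg_xlog/* /mnt/xlog/pg_xlog/
    rmdir /var/lib/pgsql/data/pg_xlog
    ln -s /mnt/xlog/pg_xlog /var/lib/pgsql/data/pg_xlog
    chown -R postgres:postgres /mnt/xlog/pg_xlog

Then throw something write-heavy like pgbench at it and see whether
fsyncs on the data partition still stall.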

As for the Areca controllers, I haven't tested them with the latest
drivers or firmware, but we would routinely get 180 to 460 days of
uptime between lockups on our 1680s we installed 2.5 or so years ago.
Of the two brand new LSI 8888 controllers we installed this summer,
we've had one fail already. However, the database didn't get
corrupted so not too bad. My preference still leans towards the
Areca, but no RAID controller is perfect and infallible.

Performance-wise the Areca is still faster than the LSI 8888, and the
newer, faster LSI just didn't work with our quad 12-core AMD mobo.
Note that all of that hardware was brand new, so things may have
improved by now. I have to say Aberdeen took great care of us in
getting the systems up and running.

As for CPUs, almost any modern CPU will do fine.
