Re: Best hardware/cost tradeoff?

From: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To: cluster <skrald(at)amossen(dot)dk>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Best hardware/cost tradeoff?
Date: 2008-08-28 19:46:01
Message-ID: dcc563d10808281246q749127dfh44475116cce7f61d@mail.gmail.com
Lists: pgsql-performance

On Thu, Aug 28, 2008 at 1:22 PM, cluster <skrald(at)amossen(dot)dk> wrote:
> I'm about to buy a combined web- and database server. When (if) the site
> gets sufficiently popular, we will split the database out to a separate
> server.
>
> Our budget is limited, so how should we prioritize?

Standard prioritization for a db server is: Disks and controller, RAM, CPU.

> * We are thinking about buying some HP Proliant server with at least 4GB ram
> and at least a dual core processor. Possibly quad core. The OS will be
> debian/Linux.

HP makes nice equipment.  Also, since this machine will be running
apache as well as pgsql, you might want to look at more memory if
it's reasonably priced.  If pg and apache use 1.5 GB total to run,
you've got 2.5 GB left for the OS to cache in.  With 8 GB of RAM,
you'd have 6.5 GB to cache in.  Also, the cost of a quad core nowadays
is pretty reasonable.
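
The arithmetic above can be sketched in a few lines of Python.  The
1.5 GB figure is the guess from this paragraph, not a measurement, and
the function name is made up for illustration:

```python
def cache_headroom_gb(total_ram_gb, app_usage_gb=1.5):
    """RAM left over for the OS page cache once apache and pg have
    taken their share (app_usage_gb is an assumed figure)."""
    return total_ram_gb - app_usage_gb

for ram in (4, 8):
    print(f"{ram} GB RAM -> {cache_headroom_gb(ram)} GB for the OS to cache in")
```

Measure the real footprint of apache and pg on your own workload
before trusting either number.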

> * Much of the database will fit in RAM so it is not *that* necessary to
> prefer the more expensive SAS 10000 RPM drives to the cheaper 7200 RPM SATA
> drives, is it?

That depends. Writes will still have to hit the drives. Reads will
be mostly from memory. Be sure to set your effective_cache_size
appropriately.
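
As a rough sketch of what "appropriately" might look like: the helper
below turns the same back-of-the-envelope estimate into a
postgresql.conf line.  The helper name and the 1.5 GB application
footprint are assumptions for illustration; effective_cache_size itself
is only a hint to the planner about how much data is likely cached, it
does not allocate any memory.

```python
def effective_cache_size_line(total_ram_gb, app_usage_gb=1.5):
    """Build a postgresql.conf line from an estimate of the RAM that is
    available for caching.  Both inputs are rough guesses."""
    cache_mb = int((total_ram_gb - app_usage_gb) * 1024)
    return f"effective_cache_size = {cache_mb}MB"

print(effective_cache_size_line(4))
print(effective_cache_size_line(8))
```

Treat the result as a starting point and adjust it to what the box
actually has free for caching once it's under load.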

> There will both be many read- and write queries and a *lot*
> (!) of random reads.
>
> * I think we will go for hardware-based RAID 1 with a good battery-backed-up
> controller.

The HP RAID controller that's been mentioned on the list seems like a
good performer.

> I have read that software RAID performs surprisingly well, but
> for a production site where hotplug replacement of dead disks is required,
> is software RAID still worth it?

The answer is maybe.  The reason people keep testing software RAID is
that a lot of cheap (not necessarily in cost, just in design)
controllers give mediocre performance compared to SW RAID.

With SW RAID on top of a caching controller in JBOD mode, the
controller simply becomes a cache that can survive power loss, and
doesn't have to do any RAID calculations any more. With today's very
fast CPUs, and often running RAID-10 for dbs, which requires little
real overhead, it's not uncommon for SW RAID to outrun HW.
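
To illustrate why mirroring needs so little CPU, here is a toy Python
comparison.  It is not how any real RAID implementation is written,
and the parity function is included only as a contrast (parity RAID is
not being recommended here); both function names are made up.

```python
from functools import reduce

def raid10_write(block):
    """Mirroring: the same block goes to both drives of a pair.
    Nothing is computed, the data is just written twice."""
    return [block, block]

def xor_parity(data_blocks):
    """Parity RAID, for contrast: the parity block is the byte-wise XOR
    of every data block in the stripe, recomputed on each write."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*data_blocks))

print(raid10_write(b"page"))
print(xor_parity([b"\x0f\x0f", b"\xf0\x01"]))
```

The mirror path is a plain copy, which is why the host CPU barely
notices it.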

With better controllers, the advantage is small to none.

> Anything else we should be aware of?

Can you go with 4 drives?  Even if they're just SATA drives, you'd be
amazed at what going from a 2 drive mirror to a 4 drive RAID-10 can do
for your performance.  Note that RAID-10 still costs you half your raw
capacity to mirroring, just like the 2 drive mirror does, but your
aggregate bandwidth on reads will be roughly doubled.
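
A best-case sketch of that comparison in Python, assuming identical
drives and a controller (or SW RAID layer) that spreads reads across
every drive.  The function name is made up, and real numbers depend
heavily on the workload:

```python
def raid10_estimate(n_drives):
    """Idealized figures for RAID-10 built from identical drives.

    A 2 drive mirror is treated as RAID-10 with a single mirrored pair.
    Returns (usable fraction of raw capacity, read bandwidth expressed
    in multiples of one drive's read bandwidth).
    """
    if n_drives < 2 or n_drives % 2 != 0:
        raise ValueError("need an even number of drives, at least 2")
    usable_fraction = 0.5       # every block is stored twice
    read_multiplier = n_drives  # best case: every drive serves reads
    return usable_fraction, read_multiplier

for n in (2, 4):
    frac, reads = raid10_estimate(n)
    print(f"{n} drives: {frac:.0%} of raw capacity usable, "
          f"up to {reads}x single-drive read bandwidth")
```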
