Re: Hardware for a database server

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Erwin Brandstetter <a9006241(at)unet(dot)univie(dot)ac(dot)at>
Cc: <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Hardware for a database server
Date: 2004-03-10 17:12:27
Message-ID: Pine.LNX.4.33.0403100954270.10418-100000@css120.ihs.com
Lists: pgsql-admin

On Wed, 10 Mar 2004, Erwin Brandstetter wrote:

> Hi List!

Howdy!

> Short version:
> It is for a medium sized database. ~ 50 users, < 5GB, biggest table < 1
> million tuples, 60 tables, lots of indices, triggers, rules, and other
> objects.

SNIP!

> What will I purchase?
>
> CPU:
> Single AMD Opteron.
> Opteron, because i plan to migrate to amd64 debian as soon as Debian
> has a stable release.
> Single, because multi-CPU would only make sense if the CPU could ever
> get the bottleneck. But I don't have to fear that, right? No need for a
> dual-cpu setup?
> Should I consider Athlon 64 FX or Athlon 64? I guess socket 940 has
> more future than socket 754, right?

I would recommend going with a dual CPU machine. While having lots of CPUs
isn't likely to be necessary for what you're doing, dual CPUs are not quite
the same thing. In most cases having dual CPUs allows the OS to run on
one CPU while the database runs on the other, so to speak. I.e. there's
enough going on to keep a second processor busy a fair bit of the time.
Beyond that, it really depends on how CPU intensive your database use is.

Generally I've found dual CPU machines to be more responsive than single
CPU machines under heavy load.

> Controller / Hard Discs:
> RAID 5 with 4+ discs including a hot spare. But SCSI or SATA?
> I am undecided on this. Until about a year ago, I would have said SCSI,
> period. But I have read of SATA RAIDs for entry-level-servers doing
> quite well and Linux dealing with it ever more smoothly.

Standard IDE drives have an issue: all the ones I've tested so far (and
presumably most of the rest) lie about fsync, and therefore are not
guaranteed to leave a coherent database on disk should you lose power
during a transaction. If SATA drives do in fact fsync properly with write
caching enabled, then they're a great choice. You might want to test one or
two before committing to a rack of them.
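
If you want to check a drive yourself, a quick and dirty test is to count
how many fsync'd writes per second it will accept -- a drive that really
flushes to the platter can't do much better than one write per platter
rotation (~120/sec at 7200 rpm, ~166/sec at 10k). A minimal sketch in
Python (path and duration are just placeholders):

    import os, time

    # Count fsync'd writes per second.  A number far above the drive's
    # rotational rate means it's caching the write and lying about fsync.
    path = "/mnt/testdrive/fsync_test.dat"   # put this on the disk under test
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    count = 0
    start = time.time()
    while time.time() - start < 5.0:
        os.write(fd, b"x" * 512)
        os.fsync(fd)             # supposedly forces the block to the platter
        os.lseek(fd, 0, 0)       # rewrite the same spot each iteration
        count += 1
    os.close(fd)
    print("fsyncs/sec:", count / 5.0)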

SCSI drives seem to pass with flying colors. With a battery backed
caching RAID controller you can get very good results.

RAID5 with a small number of drives is fine for good read performance, but
doesn't hold up in a write-heavy environment the way RAID 1+0 does. For >6
drives, RAID5 starts to catch up under heavy writes as the number of
platters available to spread the writes across goes up.

That said, we get great performance from RAID1 on our LSI megaraid, nearly
as good as a 5 drive RAID5 for reads, and better for writes.

> ([1], [2])
> So I wonder if it is still a good decision to spend 3 times the money
> per gigabyte on SCSI?

Only if your data is important. In this instance, it sounds like most of
what you're holding comes from other sources, and losing a day's worth
of data is no big deal, since you can get it back. If that's the case,
back up every night, turn off fsync, and run on a rack of IDE or SATA
drives, whichever is cheaper per meg, and spend your money on memory for
the server.
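
If you do go the fsync-off route (fsync = off in postgresql.conf), the
nightly dump is your only safety net, so automate it. A minimal sketch
you could run from cron -- database name and paths are just placeholders:

    import subprocess, datetime, os

    # Nightly pg_dump in custom format.  With fsync turned off, this dump
    # is what you restore from after a crash, so it has to actually run.
    dump_dir = "/var/backups/pgsql"
    stamp = datetime.date.today().isoformat()
    outfile = os.path.join(dump_dir, "mydb-%s.dump" % stamp)

    with open(outfile, "wb") as f:
        subprocess.check_call(["pg_dump", "-Fc", "mydb"], stdout=f)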

> And do 3ware Controllers still have the best driver support under
> Linux?

LSI/Megaraid's megaraid2 driver is very fast and very stable. The Adaptec
driver seems to be working pretty well nowadays as well. Installation on
our Dell boxes with the Adaptec was much more difficult than with the
megaraid2 driver, which uses dkms (Dynamic Kernel Module Support), a very
cool system: you install the dkms rpm, then when you install a source rpm
for a driver, the dkms package kicks in, compiles it, puts it in the right
directory, and you're ready to use it.

I've not used the 3ware controller before.

> Any harddisks known to be especially apt for databases (hi I/O load
> ..)?

I really like the performance my Seagate Barracuda 36 gig 10krpm drives
give me. If it's mostly read, just throw as many drives at it as you can
on fast busses. Aggregate bandwidth is almost always the real key to fast
performance.

> Power supply:
> Secured with UPS, auto-shutdown before power fails, so do I need my
> RAID controller battery-backed still?

Yep. Power supplies fail, motherboards fry and take out the power rail
every so often. Idiots trip over power cords. Hehe. Been there, done
that, got the T-shirt.

> RAM:
> As much as the motherboard will bear. 4 GB probably. This seems the
> easiest point to decide on. Correct? DDR SDRAM PC333 or PC400?

If you're looking at 64 bit machines, most of those can hold >4 gig, at
least 8 gig nowadays. Don't buy tons today, but do buy enough to
interleave access if you have >1 CPU. Opterons can interleave access, and
theoretically each CPU could get max memory bandwidth if you have enough
banks to allow interleaving. So it's theoretically possible that an SMP
machine with 8 sticks totalling 1 gig could outrun the same machine with
2 sticks totalling 2 gigs, since there'd be roughly a 75% chance (i.e. only
a 1 in 4 chance of both CPUs hitting the same interleaved bank) that the
second CPU accessing memory would not be in contention with the first CPU.

If you're looking at 32 bit machines, stick to 2 gigs unless you will
regularly be accessing more than that, as going beyond 2 gigs invokes a 10
to 15% performance hit due to the goofy switching scheme used there.
Spend the money on more hard drives.

Look at the performance tuning article on varlena:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
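
As a rough illustration of the sort of arithmetic that article walks you
through: in 7.x, shared_buffers and effective_cache_size are set in 8 KB
pages, so you end up converting megabytes to pages by hand. The fractions
below are only illustrative starting points, not recommendations:

    # Convert "how much RAM should Postgres use" into the 8 KB-page units
    # that postgresql.conf expects.  Fractions here are illustrative only.
    ram_mb = 4096                                 # total RAM in the box
    pages_total = ram_mb * 1024 // 8              # 8 KB pages in all of RAM

    shared_buffers = pages_total // 10            # ~10% of RAM for buffers
    effective_cache_size = pages_total * 2 // 3   # ~2/3 of RAM as OS cache estimate

    print("shared_buffers = %d" % shared_buffers)
    print("effective_cache_size = %d" % effective_cache_size)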
