Re: Hardware needed for 15,000,000 record DB?

From: Jeremy Buchmann <jeremy(at)wellsgaming(dot)com>
To: pdg(at)stratos(dot)net
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Hardware needed for 15,000,000 record DB?
Date: 2002-04-22 22:17:26
Message-ID: 3CC48BF6.8080006@wellsgaming.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-novice

pdg(at)stratos(dot)net wrote:
> Hello, :-)
>
> Can someone help me spec out the hardware needed for a simple web-based
> database app?
>
> Basically, that application needs to lookup a single row by its primary key.
> This would be fairly straightforward, except that the table needs to contain
> up to 15 million records. Each row will contain approximately 25 variable
> length CHAR fields with perhaps a total of 3 to 4 kilobytes of data per row.
>
> Updates will be done nightly via some sort of a batch process.
>
> What kind of hardware would be needed for this sort of application? The
> queries are not complex, it's just a lot of data.

Databases are I/O bound...if your I/O is slow, your database will be slow.
So your goal is to minimize the amount of I/O needed and the time it takes
to do the I/O. You minimize the amount of I/O by getting things with big
caches. You minimize the amount of time the I/O takes by using fast
storage devices.

This means SCSI. If you have limited funds, spend them on fast SCSI disks
with big caches and a good controller instead the latest and greatest
processor.

If you have the funds, look into getting a RAID card with a big cache on it.
A RAID 1 or 5 also helps out if a disk crashes.

> Would a midrange Celeron processor with 256 MB RAM be sufficient? How would
> backups work for a database this large?

I would not use the Celeron. The Celeron lacks cache, and you want all the
cache you can afford. One user posted benchmarks to this list showing the
Celeron to be twice as slow as a PIII at the same clock speed.

Buy a lot of memory. And make sure the motherboard you get can support a lot
of memory (i.e., it has 4 or more slots for RAM).

Backups: DDS3/4 tape is a good, but expensive choice. It's one of the few backup
technologies that has cheap media and is vendor-neutral. Stay away from
"Big floppies" like Jaz drives. They're too flaky.

> Is PostgreSQL even the best database engine for this app? Perhaps MySQL? Or
> maybe a Microsoft solution?

MySQL is traditionally faster at pumping out web pages with pure speed, but
PostgreSQL has been catching up very quickly. Also, PostgreSQL has
traditionally been able to handle many more concurrent users, but I think
MySQL has been getting better there, too. I haven't seen any banchmarks
with recent versions of either database. They're both free, so try some
tests with both of them (but don't forget to tune them properly).

If you care anything about cost or flexibility, I wouldn't go for a Microsoft
"solution".

--Jeremy

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Ross J. Reedstrom 2002-04-22 22:51:46 Re: An Analyze question
Previous Message Tom Lane 2002-04-22 20:32:37 Re: An Analyze question

Browse pgsql-novice by date

  From Date Subject
Next Message Curt Sampson 2002-04-23 02:34:32 Re: Hardware needed for 15,000,000 record DB?
Previous Message Joshua b. Jore 2002-04-22 13:13:27 Re: triggers