Re: Hardware HD choice...

From: Ivan Voras <ivoras(at)freebsd(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Hardware HD choice...
Date: 2008-10-23 20:47:45
Message-ID: gdqnto$a8i$1@ger.gmane.org
Lists: pgsql-performance

Lionel wrote:
> Hello,
>
> I have to choose a dedicated server to host a big 8.3 database.
> The global size of the database (indexes included) will grow by 40 Go every
> year (40 millions of lines/year)
> Real data (indexes excluded) will be around 5-7 Go/year.
> I need to store 4 years of activity.
> Very few simultaneous users (~4).
> 100000 rows added every day via csv imports.
> The application will be a reporting application.
> Main statements: aggregation of 10000 to 10millions of line.
> Vast majority will hit 300000 lines (90% of the connected users), few will
> hit more than 10 millions (10% of the connected users, there may never be 2
> simultaneous users of this category).
> 20sec - 30sec for such a statement is acceptable.
>
> It's quite easy to choose CPU (xeon quad core 2.66, maybe dual xeon), RAM
> (8-12Go) but I still hesitate for hard disks.

If the number of users is low or your queries are complex, a faster CPU
with fewer cores will serve you better because PostgreSQL cannot split a
single query across multiple CPUs/cores. It will also speed up your CSV
imports (be sure to do them as a single transaction or with COPY). If
you can bear the heat (and the electricity bill) you can get 3.4 GHz
Xeons. Be sure to use a 64-bit OS and lots of memory (16 GB+).
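
To illustrate the "single transaction or COPY" advice, here is a minimal
sketch of a daily CSV import. It is not taken from the original setup: it
assumes a psycopg2-style connection object with autocommit off, a CSV file
with a header row, and a hypothetical table name (daily_activity).

```python
def import_csv(conn, table, csv_file):
    """Load one CSV file into `table` with a single COPY, in one transaction.

    Assumptions (not from the original post):
    - `conn` behaves like a psycopg2 connection with autocommit off;
    - `table` is a trusted identifier, it is interpolated into the SQL as-is;
    - the CSV file starts with a header row.
    """
    cur = conn.cursor()
    try:
        # One COPY statement streams the whole file to the server.
        cur.copy_expert("COPY %s FROM STDIN WITH CSV HEADER" % table, csv_file)
        # A single commit makes the whole file visible at once.
        conn.commit()
    except Exception:
        # Any failure discards the partial load.
        conn.rollback()
        raise
    finally:
        cur.close()


# Usage (hypothetical names, requires psycopg2 and a running server):
#   conn = psycopg2.connect("dbname=reporting")
#   with open("activity.csv") as f:
#       import_csv(conn, "daily_activity", f)
```

Loading the file with one COPY and one commit avoids paying the
per-statement and per-commit overhead of 100000 separate INSERTs.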

> Option 5)
> RAID10 SATA2 4x250 Go

Good enough.

> Any other better option that I could ask for ?

Yes, 8x250 :) You need as many drives as possible - not for capacity or
reliability but for speed. Use RAID 5 or RAID 6 only if the database
isn't going to be updated often (for example, if you add records to the
database only several times a day, it's ok).
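
A back-of-the-envelope check of the capacity side, as a rough sketch: it
takes the 40 Go/year growth and 4 years of retention from the original
post (reading Go as GB), uses the textbook usable-capacity formulas for
each RAID level, and ignores filesystem and controller overhead. The
function name is made up for illustration.

```python
def raid_usable_gb(level, drives, drive_gb):
    """Usable capacity in GB for a few common RAID levels (overhead ignored)."""
    if level == "raid10":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        return drives // 2 * drive_gb      # half the drives hold mirror copies
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * drive_gb     # one drive's worth of parity
    if level == "raid6":
        if drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return (drives - 2) * drive_gb     # two drives' worth of parity
    raise ValueError("unknown RAID level: %r" % level)


# Figures from the original post: 40 GB/year (indexes included), 4 years kept.
needed_gb = 40 * 4

for drives in (4, 8):
    usable = raid_usable_gb("raid10", drives, 250)
    print("RAID10 %dx250: %d GB usable, %d GB needed" % (drives, usable, needed_gb))
```

Both layouts hold the 160 GB with plenty of room, so the reason to go
from 4 to 8 drives is the extra spindles, not the extra space.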

> What would be the best choice in case of an external USB drive : using it
> for indexes or x_log ?

Skip any USB-connected drives for production environments. It's not
worth it. If new data isn't going to be added to the database
continuously, you don't need a separate x_log. It's better to use the
drive as a (hot, if possible) spare for the RAID in case one of the RAID
drives malfunctions.
