Re: [GENERAL] Hardware optimising

From: "Aaron J(dot) Seigo" <aaron(at)gtv(dot)ca>
To: pgsql-general(at)postgreSQL(dot)org
Subject: Re: [GENERAL] Hardware optimising
Date: 1999-08-26 23:18:10
Message-ID: 37C5CB32.8E379739@gtv.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

hi...

> 128 MB 100 MHz SDRAM
> AMD K6-2/300 CPU
> 10 GB 7200RPM 9.0ms IBM IDE HDD
>
> It will, over the next few months, as money becomes available, be upgraded to:
>
> 256 MB 100 MHz SDRAM
> Dual Athlon 500 CPUs
> 10 GB UltraII Wide SCSI drive
>
> The database will contain several million records and needs to be able to do
> very fast selects from tables with a lot of rows, and do small updates and
> inserts onto these tables at a good speed also.

seems there are a lot of opinions on this one floating around, and not
alot of explanations to go along with them. =)

in my experience, it really depends on what you are doing. if you are
going to be doing the same selects alot, then extra RAM will help a lot.
just be sure to set the buffers when starting postmaster to be
relatively high (i.e. a few thousand). also, since you are doing few
inserts, i'd turn off f-sync for greater speed and less disk access
(which with your current IDE will be expensive (time wise) at best).
this will run the risk of losing inserts if the machine crashes, but if
you are using a stable OS (i.e. not NT) then you'll probably be just
fine if the inserts are few compared to selects.

also, if you are going to be doing a lot of pre- and post-processing of
the data (i.e. grabbing bits of data based on a algorythm (sp) or
getting bits of data and massaging them about a lot (i.e. creating
graphs, lay-out, doing analysis, etc)) then RAM will also see a boost as
you will be able to do these in memory, allowing the database the disk
more to itself...

the SCSI drive will see an increase in speed to be sure! in fact, i'd
suggest giving the database the drive all to itself for data... leave
everything else on the IDE drive (OS, database engine, etc) and format
the SCSI drive with large i-node blocks (i.e. 1MB) and just let the data
reside on the SCSI disk. besides gaining the speed of the disk you'll
also allow the rest of the system to stay the hell out of the way of
that disk intensive database! =) you'll probably see a tremendous
increase in speed doing it this way (large inodes, only database data)
than if you just simply replace the IDE with the SCSI drive...

of course, as time gos on, if you use a mirroring raid array by adding
another disk, you'll see even more speed increase. other RAIDs, while
preserving your data, will result in slow downs.. though it will still
be faster than an IDE drive with everything on it. RAID 5 is cheaper
(more out of your disk space) but will be a bit slower than a single
disk system or a RAID 5... but RAID 5 is a nice way to go...

however, if you do go RAID, DO NOT use software RAID. why? well... it
negates some of the fail-safe power of the RAID (although if well set up
you can render this moot) but more importantly it will drag
significantly on your processors (~10% or so is common)

as for the processor, this will see an increase, of course. note,
however, that since PostgreSQL is _not_ multithreaded, that it will run
only on one of the processors. (i'm about to assume you are using linux
here... 'scuse me if i'm wrong) however, the good news is that you can
encourage linux (through the scheduler) to run postgres on one of the
processors and everything else on the other one. this should give the
database its own processor more oft than not. things may still drift,
etc... but it will be better this way....

the processor boost will be important, again, if you are doing lots of
pre/post-processing of data. it will also see an improvement if you are
offering other services (i.e. WWW) on the machine (which i'm guessing
you are). this will require a kernel recompile and some muckin' about to
get it all running as quickly/smoothly/efficiently as possible.

this is another side of things to look at:

RAM is quick and instant. power down, slap in some more ram, power up.
fast down time.

DISK upgrade will take more time. i.e. formatting; setting it in your
FSTAB, etc; changing your start up scripts to tell postgres where the
data is now; copying things to the new disk, etc... this will result in
some fairly good down time. the installation can be done quickly (if
well thought out, i.e. pre-format the drive, etc...) and the rest can be
done while online. although you'll want to shut the database down while
copying data files. the longer you wait on this one, the longer your
down time will be (more data to copy, etc...)

CPU upgrade will require downtime to install (not nearly as fast or easy
as RAM).. then kernel recompiling.. then testing of the new kernel...
then tweaking the system. probably resulting in even more down time than
with the disk upgrade.

i'm guessing that while the system is new, you'll probably be more
agreeable to longer downtimes. so perhaps the disk upgrade would be
better earlier on in that it will probably give you the best improvement
while absorbing down time impact early in on the venture when it might
not be noticed so much (i'm guessing here again as to the nature of your
usage... assuming the demands on the system will start out smallish and
grow as time gos on...)

second, i'd do the RAM upgrade. cheap, fast, good improvements.

third, i'd tackle the CPUs (tricky, not as cheap, fair amount of
downtime...)

but that's just me. and those who know me know that i'm often off to one
side of the field. usually looking at the clouds, in fact. =)

as a bit of last advice, SysAdmin magazine, numerous O'Reilly books, and
many online FAQs and HOW-TOs give some really good advice on these sorts
of issues. these aren't really database related as much as they are
systems administration questions and are applicable to most high-demand
services. get the books/mags, read incessently, keep up to date on
what's what, etc, etc and your new service can only stand to benefit.

Aaron Seigo
somebody who types a lot during the course of a day.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message sam smith 1999-08-26 23:45:06 PL/pgSQL and arrays
Previous Message Phil Oelkers 1999-08-26 17:52:32 pg_class missing