From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | "Jakab Laszlo" <jakablaszlo(at)sofasoft(dot)ro>, "Justin Clift" <justin(at)postgresql(dot)org> |
Cc: | <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: performance issues for processing more then 150000 |
Date: | 2003-02-23 20:44:06 |
Message-ID: | 200302231244.06242.josh@agliodbs.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Jakab,
Some simple tips, which is what I thing you were really asking for:
Are you adding the records in a single overnight or downtime load batch? If
so, the fastest way by far is to:
1) disable all triggers and constraints on the table temporarily, and some or
all of the indexes
2) put all the data into a COPY file (tab-delimited text; see COPY docs)
3) load the data through a COPY statement
4) re-enable the trigger and constraints and re-build the indexes
The degree to which you need to do 1) and 4) depends on how much punishment
your system can take; start out by dropping and rebuilding just the triggers
and move up from there until the load finishes in a satisfactory time.
If the records are being added on a continuous basis and not in a big batch,
then take the following precautions:
1) put as many inserts as possible into transaction batches
2) minimize your use of constraints, triggers, and indexes on the tables being
loaded
3) consdier using a "buffer table" to hold records about to be loaded while
data integrity checks and similar are performed.
> Unfortunatelly the hardware budget should be keept as low as possible.
> I was thinking is there could be reliable solution based on dual processor
> and ATA 133 raid mirroring normally with some gigs of memory.
1 gig of RAM may be plenty. Your main bottleneck will be your disk channel.
If I were setting up your server, I might do something like:
1) buy a motherboard with 4 ATA controllers.
2) put disks like:
channel 0: 1 matched pair disks
channel 1 + 2: 1 matched quartet of disks
channel 3: single ATA disk
(for Postgresql, more, smaller disks is almost always better than a few big
ones.) (alternately, set up everythin in one big RAID 5 array with at least 6
disks. There is argument about which is better)
3) Format the above as a RAID 1 pair on channel 0 and a RAID 1+0 double pair
on channel 1 using Linux software RAID
4) Put Linux OS + swap on channel 0. Put the database on channel 1+2. Put
the pg_xlog (the transaction log) on channel 3. Make sure to use a version
of Linux with kernel 2.4.19 or greater!
That's just one configuration of several possible, of course, but may serve
your purposes admirably and relatively cheaply.
--
Josh Berkus
Aglio Database Solutions
San Francisco
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2003-02-23 20:52:30 | Re: slow query |
Previous Message | Tom Lane | 2003-02-22 04:23:47 | Re: Really bad insert performance: what did I do wrong? |