Quick Links

Re: Performace Optimization for Dummies

From:	"Merlin Moncure" <mmoncure(at)gmail(dot)com>
To:	"Carlo Stonebanks" <stonec(dot)register(at)sympatico(dot)ca>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Performace Optimization for Dummies
Date:	2006-09-28 20:06:56
Message-ID:	b42b73150609281306r4806ab8fy7ddbb2fac31276d2@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On 9/28/06, Carlo Stonebanks <stonec(dot)register(at)sympatico(dot)ca> wrote:
> > are you using the 'copy' interface?
>
> Straightforward inserts - the import data has to transformed, normalised and
> de-duped by the import program. I imagine the copy interface is for more
> straightforward data importing. These are - buy necessity - single row
> inserts.

right. see comments below.

> > thats a tough question. my gut says that windows will not scale as
> > well as recent linux kernels in high load environments.
>
> But not in the case of a single import program trying to seed a massive
> database?

probably not.

> > hearing good things about the woodcrest. pre-woodcrest xeon (dempsey
> > down) is outclassed by the opteron.
>
> Need to find a way to deterimine the Xeon type. The server was bought in
> early 2006, and it looks like woodcrest was form July.

ok, there are better chips out there but again this is not something
you would really notice outside of high load environements.

> > 1. can probably run fsync=off during the import
> > 2. if import is single proecess, consider temporary bump to memory for
> > index creation. or, since you have four cores consider having four
> > processes import the data somehow.
> > 3. turn off stats collector, stats_command_string, stats_row_level,
> > and autovacuum during import.

by the way, stats_command_string is a known performance killer that
iirc was improved in 8.2. just fyi.

I would suggest at least consideration of retooling your import as
follows...it might be a fun project to learn some postgresql
internals. I'm assuming you are doing some script preprocessing in a
language like perl:

bulk load denomalized tables into scratch tables into the postgresql
database. create indexes appropriate to the nomalization process
remembering you can index on virtually any expression in postgresql
(including regex substitution).

use sql to process the data. if tables are too large to handle with
monolithic queries, use cursors and/or functions to handle the
conversion. now you can keep track of progress using pl/pgsql raise
notice command for example.

merlin

In response to

Re: Performace Optimization for Dummies at 2006-09-28 17:53:22 from Carlo Stonebanks

Responses

Re: Performace Optimization for Dummies at 2006-09-29 04:46:54 from Carlo Stonebanks

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Carlo Stonebanks	2006-09-28 20:15:03	Re: Performace Optimization for Dummies
Previous Message	Carlo Stonebanks	2006-09-28 19:47:15	Re: Performace Optimization for Dummies