Skip site navigation (1) Skip section navigation (2)

Re: Performace Optimization for Dummies

From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Carlo Stonebanks" <stonec(dot)register(at)sympatico(dot)ca>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performace Optimization for Dummies
Date: 2006-09-28 20:06:56
Message-ID: b42b73150609281306r4806ab8fy7ddbb2fac31276d2@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
On 9/28/06, Carlo Stonebanks <stonec(dot)register(at)sympatico(dot)ca> wrote:
> > are you using the 'copy' interface?
>
> Straightforward inserts - the import data has to transformed, normalised and
> de-duped by the import program. I imagine the copy interface is for more
> straightforward data importing. These are - buy necessity - single row
> inserts.

right. see comments below.

> > thats a tough question.  my gut says that windows will not scale as
> > well as recent linux kernels in high load environments.
>
> But not in the case of a single import program trying to seed a massive
> database?

probably not.

> > hearing good things about the woodcrest. pre-woodcrest xeon (dempsey
> > down) is outclassed by the opteron.
>
> Need to find a way to deterimine the Xeon type. The server was bought in
> early 2006, and it looks like woodcrest was form July.

ok, there are better chips out there but again this is not something
you would really notice outside of high load environements.

> > 1. can probably run fsync=off during the import
> > 2. if import is single proecess, consider temporary bump to memory for
> > index creation. or, since you have four cores consider having four
> > processes import the data somehow.
> > 3. turn off stats collector, stats_command_string, stats_row_level,
> > and autovacuum during import.

by the way, stats_command_string is a known performance killer that
iirc was improved in 8.2. just fyi.

I would suggest at least consideration of retooling your import as
follows...it might be a fun project to learn some postgresql
internals.  I'm assuming you are doing some script preprocessing in a
language like perl:

bulk load denomalized tables into scratch tables into the postgresql
database. create indexes appropriate to the nomalization process
remembering you can index on virtually any expression in postgresql
(including regex substitution).

use sql to process the data. if tables are too large to handle with
monolithic queries, use cursors and/or functions to handle the
conversion.  now you can keep track of progress using pl/pgsql raise
notice command for example.

merlin

In response to

Responses

pgsql-performance by date

Next:From: Carlo StonebanksDate: 2006-09-28 20:15:03
Subject: Re: Performace Optimization for Dummies
Previous:From: Carlo StonebanksDate: 2006-09-28 19:47:15
Subject: Re: Performace Optimization for Dummies

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group