Re: Best COPY Performance

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: Worky Workerson <worky(dot)workerson(at)gmail(dot)com>
Cc: "Craig A(dot) James" <cjames(at)modgraph-usa(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Best COPY Performance
Date: 2006-10-25 15:28:13
Message-ID: 20061025152812.GQ26892@nasby.net
Lists: pgsql-performance

On Wed, Oct 25, 2006 at 08:03:38AM -0400, Worky Workerson wrote:
> I'm just doing CSV style transformations (and calling a lot of
> functions along the way), but the end result is a straight bulk load
> of data into a blank database. And we've established that Postgres
> can do *way* better than what I am seeing, so it's not surprising that
> perl is using 100% of a CPU.

If you're loading into an empty database, there are a number of tricks
that will help you:

- Turn off fsync.
- Add constraints and indexes *after* you've loaded the data (best to add
  as many of them as possible on a per-table basis right after the table
  is loaded, so that it's hopefully still in cache).
- Crank up maintenance_work_mem, especially for tables that won't fit into
  cache anyway.
- Bump up checkpoint_segments and wal_buffers.
- Disable PITR.
- Create a table and load its data in a single transaction (8.2 will
  avoid writing any WAL data if you do this and PITR is turned off).
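
As a rough sketch of how the per-table steps above might fit together: the
table name (items), its columns, the file path, and the memory value are
all hypothetical placeholders, not taken from the original poster's setup.
The server-wide settings (fsync, checkpoint_segments, wal_buffers,
archive_command) live in postgresql.conf rather than in SQL, so they appear
here only as comments.

```sql
-- Assumed to be set in postgresql.conf before the load (values are up to you):
--   fsync = off               -- only for the duration of the bulk load
--   checkpoint_segments, wal_buffers raised from their defaults
--   archive_command left unset (PITR disabled)

-- Create the table and load it in the same transaction, so that 8.2 can
-- skip WAL for the COPY when PITR is off.
BEGIN;

CREATE TABLE items (          -- hypothetical table
    id      integer,
    name    text,
    created timestamp
);

COPY items FROM '/tmp/items.csv' WITH CSV;   -- hypothetical server-side path

COMMIT;

-- Build constraints and indexes right after the load, while the table is
-- hopefully still in cache. The memory value is illustrative only.
SET maintenance_work_mem = '512MB';

ALTER TABLE items ADD PRIMARY KEY (id);
CREATE INDEX items_created_idx ON items (created);

-- Refresh planner statistics once the data is in place.
ANALYZE items;
```

Repeat the same pattern table by table instead of loading everything first
and indexing at the end, and remember to turn fsync back on once the load
is done.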
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
