
Re: Better way to bulk-load millions of CSV records into postgres?

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net>,PgSQL Novice ML <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Better way to bulk-load millions of CSV records into postgres?
Date: 2002-05-21 23:39:25
Message-ID: 200205211639.25237.josh@agliodbs.com
Lists: pgsql-novice
Ron,

> Currently, I've got a Python script using pyPgSQL that
> parses the CSV records, builds a big
> "INSERT INTO ... VALUES (...)" command, and then execute()s it.

What's wrong with the COPY command?
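
For example, something along these lines (a minimal sketch; the table name,
file path, and delimiter are assumptions, and on a 7.2-era server the file
has to be readable by the backend on the server side, or fed to
COPY ... FROM stdin via psql):

    -- bulk-load the whole file in one statement instead of one INSERT per row
    COPY mytable FROM '/path/to/data.csv' USING DELIMITERS ',';

Plain COPY just splits on the delimiter, so if your fields are quoted or
contain embedded commas you'd still need a small preprocessing pass, but for
clean data it should be far faster than per-row INSERTs from Python.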

> top shows that this method uses postmaster with ~70% CPU
> utilization, and python with ~15% utilization.
> 
> Still, it's only inserting ~190 recs/second.  Is there a
> better way to do this, or am I constrained by the hardware?

This sounds pretty good for an ATA system.   Upgrading to SCSI-RAID will also 
improve your performance.

-Josh Berkus
