Re: Better way to bulk-load millions of CSV records into postgres?

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net>, PgSQL Novice ML <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Better way to bulk-load millions of CSV records into postgres?
Date: 2002-05-21 23:39:25
Message-ID: 200205211639.25237.josh@agliodbs.com
Lists: pgsql-novice


Ron,

> Currently, I've got a python script using pyPgSQL that
> parses the CSV record, creates a string that is a big
> "INSERT INTO VALUES (...)" command, then, execute() it.

What's wrong with the COPY command?

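Something like this, roughly (the table and file names here are just placeholders, the file has to be readable by the backend process, and the exact COPY syntax depends on your PostgreSQL version; see \h COPY in psql):

    -- Placeholder table matching the CSV columns
    CREATE TABLE import_target (
        id      integer,
        name    text,
        amount  numeric
    );

    -- Server-side bulk load; delimiter syntax varies between versions
    COPY import_target FROM '/tmp/data.csv' USING DELIMITERS ',';

COPY parses the file inside the backend in a single command, so you skip the per-row parse/plan/execute overhead of millions of INSERT statements.
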
> top shows that this method uses postmaster with ~70% CPU
> utilization, and python with ~15% utilization.
>
> Still, it's only inserting ~190 recs/second. Is there a
> better way to do this, or am I constrained by the hardware?

~190 records/second sounds pretty good for an ATA system. Upgrading to SCSI RAID would also
improve your performance.

-Josh Berkus
