Skip site navigation (1) Skip section navigation (2)

Re: Better way to bulk-load millions of CSV records into postgres?

From: Tom Sheehan <tomas(at)csi(dot)com>
To: PgSQL Novice ML <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Better way to bulk-load millions of CSV records into postgres?
Date: 2002-05-22 14:19:31
Message-ID: 001b01c2019b$b34ad620$6a01010a@DOMAIN1.com (view raw or flat)
Thread:
Lists: pgsql-novice
Have you looked at the COPY command is psql  for this?  There are the, 'FROM
{ 'filename' | stdin } and [ [USING] DELIMITERS] clauses that may be of use.
It seems to me that bulk loading is what they were designed for.

ts


----- Original Message -----
From: "Ron Johnson" <ron(dot)l(dot)johnson(at)cox(dot)net>
To: "PgSQL Novice ML" <pgsql-novice(at)postgresql(dot)org>
Sent: Tuesday, May 21, 2002 3:40 PM
Subject: [NOVICE] Better way to bulk-load millions of CSV records into
postgres?


>
> Hi,
>
> Currently, I've got a python script using pyPgSQL that
> parses the CSV record, creates a string that is a big
> "INSERT INTO VALUES (...)" command, then, execute() it.
>
> top shows that this method uses postmaster with ~70% CPU
> utilization, and python with ~15% utilization.
>
> Still, it's only inserting ~190 recs/second.  Is there a
> better way to do this, or am I constrained by the hardware?
>
> Instead of python and postmaster having to do a ton of data
> xfer over sockets, I'm wondering if there's a way to send a
> large number of csv records (4000, for example) in one big
> chunk to a stored procedure and have the engine process it
> all.
>
> Linux 2.4.18
> PostgreSQL 7.2.1
> python 2.1.3
> csv file on /dev/hda
> table on /dev/hde  (ATA/100)
>
> --
> +---------------------------------------------------------+
> | Ron Johnson, Jr.        Home: ron(dot)l(dot)johnson(at)cox(dot)net     |
> | Jefferson, LA  USA      http://ronandheather.dhs.org:81 |
> |                                                         |
> | "I have created a government of whirled peas..."        |
> |   Maharishi Mahesh Yogi, 12-May-2002,                   |
> !   CNN, Larry King Live                                  |
> +---------------------------------------------------------+
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly

In response to

Responses

pgsql-novice by date

Next:From: Tom LaneDate: 2002-05-22 14:24:48
Subject: Re: pl/perl Documentation
Previous:From: Tom LaneDate: 2002-05-22 14:02:27
Subject: Re: Undead record haunts my database, need exorcism

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group