Re: Best COPY Performance

From: "Spiegelberg, Greg" <gspiegelberg(at)cranel(dot)com>
To: "Luke Lonergan" <llonergan(at)greenplum(dot)com>,"Worky Workerson" <worky(dot)workerson(at)gmail(dot)com>,"Merlin Moncure" <mmoncure(at)gmail(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Best COPY Performance
Date: 2006-10-30 14:09:32
Message-ID: 82E74D266CB9B44390D3CCE44A781ED90177807C@POSTOFFICE.cranel.local
Lists: pgsql-performance
> -----Original Message-----
> From: pgsql-performance-owner(at)postgresql(dot)org 
> [mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of 
> Luke Lonergan
> Sent: Saturday, October 28, 2006 12:07 AM
> To: Worky Workerson; Merlin Moncure
> Cc: pgsql-performance(at)postgresql(dot)org
> Subject: Re: [PERFORM] Best COPY Performance
> 
> Worky,
> 
> On 10/27/06 8:47 PM, "Worky Workerson" 
> <worky(dot)workerson(at)gmail(dot)com> wrote:
> 
> > Are you saying that I should be able to issue multiple COPY commands
> > because my I/O wait is low?  I was under the impression that I am I/O
> > bound, so multiple simultaneous loads would have a detrimental
> > effect ...
> 
> ... 
> I agree with Merlin that you can speed things up by breaking 
> the file up.
> Alternately you can use the OSS Bizgres java loader, which 
> lets you specify the number of I/O threads with the "-n" 
> option on a single file.

As a result of this thread, and b/c I've tried this in the past but
never had much success at speeding the process up, I attempted just that
here except via 2 psql CLI's with access to the local file.  1.1M rows
of data varying in width from 40 to 200 characters COPY'd to a table
with only one text column, no keys, indexes, &c took about 15 seconds to
load. ~73K rows/second.

I broke that file into 2 files each of 550K rows and performed 2
simultaneous COPY's after dropping the table, recreating, issuing a sync
on the system to be sure, &c and nearly every time both COPY's finished
in 12 seconds.  About a 20% cut in load time, to ~91K rows/second
(roughly 25% more throughput).
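
For anyone wanting to repeat the split-and-load test, here is a rough
Python sketch of the idea, not the exact commands I ran. It assumes
psql is on the PATH and uses the client-side \copy; the database name,
table name and file paths are placeholders, and the whole file is read
into memory, which is fine at ~1M rows but not for much larger inputs.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List


def split_lines(src: Path, parts: int, out_dir: Path) -> List[Path]:
    """Split src into `parts` files with near-equal line counts."""
    with src.open("rb") as f:
        lines = f.readlines()
    base, extra = divmod(len(lines), parts)
    outputs = []
    start = 0
    for i in range(parts):
        # The first `extra` chunks take one additional line each.
        count = base + (1 if i < extra else 0)
        chunk = out_dir / f"{src.name}.part{i}"
        chunk.write_bytes(b"".join(lines[start:start + count]))
        outputs.append(chunk)
        start += count
    return outputs


def copy_command(dbname: str, table: str, chunk: Path) -> List[str]:
    """Build a psql invocation that loads one chunk via client-side \\copy."""
    # Placeholder names: dbname and table are whatever your setup uses.
    return ["psql", "-d", dbname, "-c", f"\\copy {table} from '{chunk}'"]


def load_parallel(dbname: str, table: str, chunks: List[Path]) -> None:
    """Run one psql per chunk at the same time and wait for all of them."""
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        futures = [
            pool.submit(subprocess.run, copy_command(dbname, table, c), check=True)
            for c in chunks
        ]
        for fut in futures:
            fut.result()  # re-raises if any psql exited non-zero
```

Calling split_lines(Path("data.txt"), 2, Path("/tmp")) and passing the
result to load_parallel("mydb", "load_test", chunks) would mirror the
two-session test above; threads are enough here since each one just
waits on its own psql process.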

Admittedly, this was a pretty rough test but a 20% savings, if it can be
put into production, is worth exploring for us.
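
For the record, the arithmetic behind those figures, as a quick Python
check using only the row count and timings quoted above. Note that a
20% drop in elapsed time corresponds to about a 25% rise in rows per
second, since the two are measured against different baselines.

```python
rows = 1_100_000       # rows loaded in both tests
single_secs = 15       # one COPY of the whole file
parallel_secs = 12     # two simultaneous COPYs of 550K rows each

single_rate = rows / single_secs        # about 73K rows/second
parallel_rate = rows / parallel_secs    # about 91K rows/second

# Fraction of wall-clock time saved relative to the single COPY.
time_saved = (single_secs - parallel_secs) / single_secs

# Fractional increase in rows/second relative to the single COPY.
throughput_gain = parallel_rate / single_rate - 1

print(f"{single_rate:,.0f} -> {parallel_rate:,.0f} rows/s")
print(f"time saved: {time_saved:.0%}, throughput gain: {throughput_gain:.0%}")
```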

B/c I'll be asked, I did this on an idle, dual 3.06GHz Xeon with 6GB of
memory, U320 SCSI internal drives and PostgreSQL 8.1.4.

Greg
