Re: Best COPY Performance

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Luke Lonergan <llonergan(at)greenplum(dot)com>
Cc: "Spiegelberg, Greg" <gspiegelberg(at)cranel(dot)com>, Worky Workerson <worky(dot)workerson(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Best COPY Performance
Date: 2006-10-30 16:23:44
Message-ID: 45462710.3000301@kaltenbrunner.cc
Lists: pgsql-performance
Luke Lonergan wrote:
> Stefan,
> 
> On 10/30/06 8:57 AM, "Stefan Kaltenbrunner" <stefan(at)kaltenbrunner(dot)cc> wrote:
> 
>>> We've found that there is an ultimate bottleneck at about 12-14MB/s despite
>>> having sequential write to disk speeds of 100s of MB/s.  I forget what the
>>> latest bottleneck was.
>> I have personally managed to load a bit less than 400k/s (5 int columns
>> no indexes) - on very fast disk hardware - at that point postgresql is
>> completely CPU bottlenecked (2.6 GHz Opteron).
> 
> 400,000 rows/s x 4 bytes/column x 5 columns/row = 8MB/s
> 
>> Using multiple processes to load the data will help to scale up to about
>>   900k/s (4 processes on 4 cores).

Yes, I did that about half a year ago as part of the "CREATE INDEX on a
1.8B row table" thread on -hackers, which resulted in some of the sorting
improvements in 8.2.
I don't think much more import speed can be gained by using more cores
(at least not when importing into the same table) - iirc I was at nearly
700k rows/s with two cores and 850k rows/s with 3 cores or so ...

> 
> 18MB/s?  Have you done this?  I've not seen this much of an improvement
> before by using multiple COPY processes to the same table.
> 
> Another question: how to measure MB/s - based on the input text file?  On
> the DBMS storage size?  We usually consider the input text file in the
> calculation of COPY rate.


Yeah, that is a good question (and part of the reason why I cited the
rows/sec number, btw.)


Stefan
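
For reference, the 8MB/s and 18MB/s figures quoted above count only the raw
integer payload (4 bytes per int column, 5 columns per row). A minimal sketch
of that conversion follows; the helper name payload_mb_per_s is hypothetical,
1 MB is taken as 10^6 bytes, and per-row storage overhead and the size of the
input text file are deliberately ignored, so this is neither of the two
alternative measures raised in the thread.

```python
# Hypothetical helper: convert a COPY row rate into a raw payload rate.
# Assumption: every column is a 4-byte integer; per-row storage overhead
# and the text representation in the input file are NOT counted.
INT4_BYTES = 4


def payload_mb_per_s(rows_per_s, columns, bytes_per_column=INT4_BYTES):
    """Return the raw column payload in MB/s (1 MB = 10**6 bytes)."""
    return rows_per_s * columns * bytes_per_column / 1_000_000


if __name__ == "__main__":
    # Row rates mentioned in the thread (5 int columns, no indexes).
    rates = [
        ("1 process", 400_000),
        ("2 cores", 700_000),
        ("3 cores", 850_000),
        ("4 processes", 900_000),
    ]
    for label, rows_per_s in rates:
        mb = payload_mb_per_s(rows_per_s, columns=5)
        print(f"{label}: {rows_per_s} rows/s ~ {mb:.1f} MB/s raw payload")
```

Measured against the input text file or the on-disk table size, the same row
rate would give a different MB/s figure, which is why rows/sec is the less
ambiguous number here.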
