Re: Best COPY Performance

From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Stefan Kaltenbrunner" <stefan(at)kaltenbrunner(dot)cc>
Cc: "Spiegelberg, Greg" <gspiegelberg(at)cranel(dot)com>, "Worky Workerson" <worky(dot)workerson(at)gmail(dot)com>, "Merlin Moncure" <mmoncure(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Best COPY Performance
Date: 2006-10-30 15:03:41
Message-ID: C16B625D.5708%llonergan@greenplum.com
Lists: pgsql-performance

Stefan,

On 10/30/06 8:57 AM, "Stefan Kaltenbrunner" <stefan(at)kaltenbrunner(dot)cc> wrote:

>> We've found that there is an ultimate bottleneck at about 12-14MB/s despite
>> having sequential write to disk speeds of 100s of MB/s. I forget what the
>> latest bottleneck was.
>
> I have personally managed to load a bit less than 400k rows/s (5 int
> columns, no indexes) on very fast disk hardware. At that point postgresql
> is completely CPU bottlenecked (2.6GHz Opteron).

400,000 rows/s x 4 bytes/column x 5 columns/row = 8MB/s

> Using multiple processes to load the data will help to scale up to about
> 900k/s (4 processes on 4 cores).

18MB/s? Have you done this? I've not seen this much of an improvement
before by using multiple COPY processes to the same table.
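
For reference, the conversion I'm using for both figures can be sketched as below. It assumes 4-byte int columns and counts only the raw column data, so it ignores per-row storage overhead and the larger text representation in the input file; the function name is just for illustration.

```python
def raw_data_rate_mb_s(rows_per_s: int, columns: int, bytes_per_column: int = 4) -> float:
    """Raw column-data rate in MB/s (1 MB = 10**6 bytes).

    Assumption: fixed-width columns; row headers and text-format
    overhead are not counted.
    """
    return rows_per_s * columns * bytes_per_column / 1_000_000


print(raw_data_rate_mb_s(400_000, 5))  # single COPY process -> 8.0
print(raw_data_rate_mb_s(900_000, 5))  # four COPY processes -> 18.0
```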

Another question: how should MB/s be measured? Based on the size of the
input text file, or on the DBMS storage size? We usually use the size of
the input text file when calculating the COPY rate.
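
A minimal sketch of that convention, assuming the load time is measured separately (wall clock around the COPY) and passed in as seconds. The helper names are hypothetical. The same mb_per_s() helper could be fed the on-disk size instead (for example a byte count from pg_relation_size()), which will generally give a different number for the same load.

```python
import os


def mb_per_s(num_bytes: int, elapsed_s: float) -> float:
    """Throughput in MB/s (1 MB = 10**6 bytes) for a byte count and a duration."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return num_bytes / 1_000_000 / elapsed_s


def input_file_rate_mb_s(path: str, elapsed_s: float) -> float:
    """COPY rate based on the size of the input text file."""
    return mb_per_s(os.path.getsize(path), elapsed_s)
```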

- Luke
