Skip site navigation (1) Skip section navigation (2)

Best COPY Performance

From: "Worky Workerson" <worky(dot)workerson(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Best COPY Performance
Date: 2006-10-20 20:05:33
Message-ID: ce4072df0610201305u7836dc7bn437ec0aaf1c3e25f@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
What is the best COPY performance that you have gotten on a "normal" table?

I know that this is question is almost too general, but it might help
me out a bit, or at least give me the right things to tweak.  Perhaps
the question can be rewritten as "Where are the major bottlenecks in a
COPY?" or "How can I compute the max theoretical COPY performance for
my hardware?".  The two subquestions that I have from this are:
  -Are my ETL scripts (perl) maximizing the database COPY speeds?
  -Can I tweak my DB further to eek out a bit more performance?

I'm using perl to ETL a decent sized data set (10 million records) and
then loading it through perl::DBI's copy.  I am currently getting
between 10K and 15K inserts/second.  I've profiled the ETL scripts a
bit and have performance-improved a lot of the code, but I'd like to
determine whether it makes sense to try and further optimize my Perl
or count it as "done" and look for improvements elsewhere.

I ran trivial little insert into a table with a single integer row and
came close to 250K inserts/second using psql's \copy, so I'm thinking
that my code could be optimized a bit more, but wanted to check around
to see if that was the case.

I am most interested in loading two tables, one with about 21 (small)
VARCHARs where each record is about 200 bytes, and another with 7
INTEGERs, 3 TIMESTAMPs, and 1 BYTEA where each record is about 350
bytes.

I have implemented most of the various bits of PG config advice that I
have seen, both here and with a some googling, such as:

 wal_buffers=128
 checkpoint_segments=128
 checkpoint_timeout=3000

Software:  PG 8.1.3 on RHEL 4.3 x86_64
Hardware: Quad Dual-core Opteron, Fibre Channel SAN with 256M BBC

Thanks!

Responses

pgsql-performance by date

Next:From: alvisDate: 2006-10-20 20:26:02
Subject: Re: New hardware thoughts
Previous:From: Joshua D. DrakeDate: 2006-10-20 18:52:09
Subject: Re: New hardware thoughts

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group