Skip site navigation (1) Skip section navigation (2)

Re: Very poor performance loading 100M of sql data using copy

From: John Rouillard <rouilj(at)renesys(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Very poor performance loading 100M of sql data using copy
Date: 2008-04-28 18:00:53
Message-ID: 20080428180053.GL6622@renesys.com (view raw or flat)
Thread:
Lists: pgsql-performance
On Mon, Apr 28, 2008 at 06:53:09PM +0100, Heikki Linnakangas wrote:
> John Rouillard wrote:
> >We are running postgresql-8.1.3 under Centos 4
> You should upgrade, at least to the latest minor release of the 8.1 
> series (8.1.11), as there has been a bunch of important bug and security 
> fixes. Or even better, upgrade to 8.3, which has reduced the storage 
> size of especially variable length datatypes like text/char/varchar in 
> particular. As your COPY is I/O bound, reducing storage size will 
> translate directly to improved performance.

Yup. Just saw that suggestion in an unrelated email.
 
> >dm-6 is where the data files reside and dm-4 is where the WAL archives
> >are kept. Note all the DM's are on the same RAID 0 device /dev/sda2.
> 
> Another reason to upgrade to 8.3: if you CREATE or TRUNCATE the table in 
> the same transaction as you COPY into it, you can avoid WAL logging of 
> the loaded data, which will in the best case double your performance as 
> your WAL is on the same physical drives as the data files.

We can't do this as we are backfilling a couple of months of data into
tables with existing data.
 
> >The only indexes we have to drop are the ones on the primary keys
> >(there is one non-primary key index in the database as well).
> >
> >Can you drop an index on the primary key for a table and add it back
> >later?  Am I correct in saying: the primary key index is what enforces
> >the unique constraint in the table? If the index is dropped and
> >non-unique primary key data has been added, what happens when you
> >re-add the index?
> 
> Yes, the index is what enforces the uniqueness. You can drop the primary 
> key constraint, and add it back after the load with ALTER TABLE. If the 
> load introduces any non-unique primary keys, adding the primary key 
> constraint will give you an error and fail.

That's the part I am worried about. I guess using psql to delete the
problem row then re-adding the index will work.
 
> Dropping and recreating the indexes is certainly worth trying.

Thanks for the info.

-- 
				-- rouilj

John Rouillard
System Administrator
Renesys Corporation
603-244-9084 (cell)
603-643-9300 x 111

In response to

Responses

pgsql-performance by date

Next:From: Greg SmithDate: 2008-04-28 18:16:02
Subject: Re: Very poor performance loading 100M of sql data using copy
Previous:From: Heikki LinnakangasDate: 2008-04-28 17:53:09
Subject: Re: Very poor performance loading 100M of sql data using copy

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group