Re: Very poor performance loading 100M of sql data using copy

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "John Rouillard" <rouilj(at)renesys(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Very poor performance loading 100M of sql data using copy
Date: 2008-04-28 17:53:09
Message-ID: 48160F05.2090207@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

John Rouillard wrote:
> We are running postgresql-8.1.3 under Centos 4

You should upgrade, at least to the latest minor release of the 8.1
series (8.1.11), as there has been a bunch of important bug and security
fixes. Or even better, upgrade to 8.3, which has reduced the storage
size of especially variable length datatypes like text/char/varchar in
particular. As your COPY is I/O bound, reducing storage size will
translate directly to improved performance.

> dm-6 is where the data files reside and dm-4 is where the WAL archives
> are kept. Note all the DM's are on the same RAID 0 device /dev/sda2.

Another reason to upgrade to 8.3: if you CREATE or TRUNCATE the table in
the same transaction as you COPY into it, you can avoid WAL logging of
the loaded data, which will in the best case double your performance as
your WAL is on the same physical drives as the data files.

> The only indexes we have to drop are the ones on the primary keys
> (there is one non-primary key index in the database as well).
>
> Can you drop an index on the primary key for a table and add it back
> later? Am I correct in saying: the primary key index is what enforces
> the unique constraint in the table? If the index is dropped and
> non-unique primary key data has been added, what happens when you
> re-add the index?

Yes, the index is what enforces the uniqueness. You can drop the primary
key constraint, and add it back after the load with ALTER TABLE. If the
load introduces any non-unique primary keys, adding the primary key
constraint will give you an error and fail.

Dropping and recreating the indexes is certainly worth trying.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message John Rouillard 2008-04-28 18:00:53 Re: Very poor performance loading 100M of sql data using copy
Previous Message Greg Smith 2008-04-28 17:39:39 Re: Replication Syatem