Re: bulk insert performance problem

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Christian Bourque <christian(dot)bourque(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: bulk insert performance problem
Date: 2008-04-08 03:18:48
Message-ID: 47FAE418.3000406@postnewspapers.com.au

Christian Bourque wrote:
> Hi,
>
> I have a performance problem with a script that does massive bulk
> insert in 6 tables. When the script starts the performance is really
> good but will degrade minute after minute and take almost a day to
> finish!
>
Would I be correct in guessing that there are foreign key relationships
between those tables, and that there are significant numbers of indexes
in use?

The foreign key checks get more expensive as the referenced tables grow, and
AFAIK the indexes also get a bit more expensive to maintain as they grow.

If possible, you should drop your foreign key constraints and your indexes,
insert your data, then re-create the indexes and foreign keys. The foreign
keys will be re-checked in bulk when you re-create them, and it's *vastly*
faster to do it that way. Similarly, building an index from scratch is quite
a bit faster than adding to it row by row. Of course, dropping the indexes
only helps if you aren't querying the tables while you load them.
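
For example, something along these lines (just a sketch -- the table, index
and constraint names here are made up):

    -- drop the FK constraint and index before the bulk load
    ALTER TABLE child_table DROP CONSTRAINT child_table_parent_id_fkey;
    DROP INDEX child_table_somecol_idx;

    -- load the data, ideally with COPY rather than row-by-row INSERTs
    COPY child_table FROM '/path/to/child_table.csv' WITH CSV;

    -- rebuild the index and re-check the FK in one pass at the end
    CREATE INDEX child_table_somecol_idx ON child_table (somecol);
    ALTER TABLE child_table
        ADD CONSTRAINT child_table_parent_id_fkey
        FOREIGN KEY (parent_id) REFERENCES parent_table (id);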

Also, if you're loading data with stored procedures, avoid exception blocks.
I had major problems with my bulk data conversion code because overuse of
exception blocks created large numbers of subtransactions behind the scenes
and slowed everything to a crawl.
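
The pattern to watch out for looks something like this (sketch only, with
invented table and column names) -- every pass through the EXCEPTION block
starts and ends a subtransaction:

    CREATE OR REPLACE FUNCTION load_child_rows() RETURNS void AS $$
    DECLARE
        rec RECORD;
    BEGIN
        FOR rec IN SELECT parent_id, somecol FROM staging_table LOOP
            BEGIN
                INSERT INTO child_table (parent_id, somecol)
                VALUES (rec.parent_id, rec.somecol);
            EXCEPTION WHEN unique_violation THEN
                -- silently skipping duplicates looks cheap, but it costs
                -- a subtransaction for every single row
                NULL;
            END;
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;

If you can clean the data up front (say, weed the duplicates out of the
staging table first), a plain INSERT ... SELECT with no exception handler at
all is far cheaper.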

--
Craig Ringer
