Re: bulk insert performance problem

From: Chris <dmagick(at)gmail(dot)com>
To: Christian Bourque <christian(dot)bourque(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: bulk insert performance problem
Date: 2008-04-08 03:32:56
Message-ID: 47FAE768.80200@gmail.com
Lists: pgsql-performance

Craig Ringer wrote:
> Christian Bourque wrote:
>> Hi,
>>
>> I have a performance problem with a script that does massive bulk
>> inserts into 6 tables. When the script starts, performance is really
>> good, but it degrades minute after minute and the whole run takes
>> almost a day to finish!
>>
> Would I be correct in guessing that there are foreign key relationships
> between those tables, and that there are significant numbers of indexes
> in use?
>
> The foreign key checking costs will go up as the tables grow, and AFAIK
> the indexes get a bit more expensive to maintain too.
>
> If possible, you should probably drop your foreign key relationships
> and drop your indexes, insert your data, then re-create the indexes
> and foreign keys. The foreign keys will be rechecked when you recreate
> them, and it's *vastly* faster to do it that way. Similarly, building
> an index from scratch is quite a bit faster than progressively adding
> to it. Of course, dropping the indexes is only useful if you aren't
> querying the tables as you build them.
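
To make that concrete, it would look something like this -- the table,
index and constraint names below are only placeholders, so substitute
the ones from your own schema:

-- placeholder names: child_table, parent_table, child_table_created_idx
alter table child_table drop constraint child_table_parent_id_fkey;
drop index child_table_created_idx;

-- run the bulk insert here

create index child_table_created_idx on child_table (created);
alter table child_table add constraint child_table_parent_id_fkey
    foreign key (parent_id) references parent_table (id);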

If you are querying the tables during the load, add "analyze" commands
throughout the import, e.g. every 10,000 rows. That keeps the planner
statistics up to date, so those checks should be a bit faster.
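
For example, interleaving the analyzes into the load would look roughly
like this (still using the placeholder table from above):

-- load the next batch of ~10,000 rows (generated data, just for illustration)
insert into child_table (parent_id, created)
    select (g % 100) + 1, now() from generate_series(1, 10000) g;
-- refresh the statistics the planner uses for the foreign key checks
analyze child_table;
-- keep going, running analyze again every 10,000 rows or so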

The other suggestion would be to do block commits:

begin;
do stuff for 5000 rows;
commit;

repeat until finished.
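
With the same placeholder table as above, each block is just an explicit
transaction around a manageable chunk of rows:

begin;
insert into child_table (parent_id, created)
    select (g % 100) + 1, now() from generate_series(1, 5000) g;
commit;

begin;
insert into child_table (parent_id, created)
    select (g % 100) + 1, now() from generate_series(5001, 10000) g;
commit;

-- and so on until the load is done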

--
Postgresql & php tutorials
http://www.designmagick.com/
