Re: bulk insert performance problem

From: PFC <lists(at)peufeu(dot)com>
To: "Christian Bourque" <christian(dot)bourque(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: bulk insert performance problem
Date: 2008-04-08 22:48:35
Message-ID: op.t9bdm9bicigqcu@apollo13.peufeu.com
Lists: pgsql-performance

> I have a performance problem with a script that does massive bulk
> inserts into 6 tables. When the script starts, the performance is
> really good, but it degrades minute after minute and the script takes
> almost a day to finish!

It looks like foreign key checks are slowing you down.

- Batch your INSERTs in transactions (1,000-10,000 rows per
transaction); see the first sketch below.
- Run ANALYZE once in a while so the FK checks use the indexes (also
in the first sketch).
- Are there any DELETEs in your script which might hit nonindexed
REFERENCES... columns when they cascade? (second sketch below)
- Do you really need to check the FKs on the fly while inserting?
That is, do you handle FK violations, or is your data already
consistent? In the latter case, load the data without any constraints
(and without any indexes), and add the indexes and foreign key
constraints after the load has finished (third sketch below).
- Use COPY instead of INSERT.
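
A minimal sketch of the batching idea; the table and column names
here are made up for illustration:

  BEGIN;
  INSERT INTO orders (id, customer_id) VALUES (1, 42);
  -- ... a few thousand more INSERTs ...
  COMMIT;

  -- refresh the planner statistics every few batches so the
  -- FK lookup plans keep using the indexes
  ANALYZE orders;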
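
On the DELETE point: PostgreSQL indexes the referenced primary key
automatically, but not the referencing column, so a cascaded DELETE
can end up sequentially scanning the child table for every deleted
parent row. A hypothetical fix, assuming a child.parent_id column
that REFERENCES parent(id):

  -- without this index, each cascaded DELETE scans all of "child"
  CREATE INDEX child_parent_id_idx ON child (parent_id);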
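
And if the data is known to be consistent, the drop-load-recreate
approach could look like this (the constraint, index, and file names
are hypothetical):

  ALTER TABLE child DROP CONSTRAINT child_parent_id_fkey;
  DROP INDEX child_parent_id_idx;

  -- bulk load with COPY, much faster than row-by-row INSERTs
  COPY child (id, parent_id, payload) FROM '/tmp/child.csv' WITH CSV;

  -- rebuild the index and constraint once the load is finished
  CREATE INDEX child_parent_id_idx ON child (parent_id);
  ALTER TABLE child ADD CONSTRAINT child_parent_id_fkey
      FOREIGN KEY (parent_id) REFERENCES parent (id);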

If your script also processes the data, perhaps you could import the
raw, unprocessed data into a staging table (with COPY) and process it
with SQL afterwards. That is usually much faster than doing a zillion
single-row INSERTs.
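
For instance (the staging table and the processing query are invented
for the example):

  -- load the raw rows untouched
  CREATE TABLE staging (raw_customer text, raw_amount text);
  COPY staging FROM '/tmp/dump.csv' WITH CSV;

  -- then process the whole set in one statement instead of
  -- a zillion single-row INSERTs
  INSERT INTO orders (customer_id, amount)
  SELECT c.id, s.raw_amount::numeric
  FROM staging s
  JOIN customers c ON c.name = s.raw_customer;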
