Slow Bulk Delete

From: thilo <thilo(dot)tanner(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Slow Bulk Delete
Date: 2010-05-08 11:39:58
Message-ID: f923f5db-a6b1-4029-8ebe-a6d1cc35932a@e2g2000yqn.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi all!

We moved from MySQL to Postgresql for some of our projects. So far
we're very impressed with the performance (especially INSERTs and
UPDATEs), except for a strange problem with the following bulk delete
query:

DELETE FROM table1 WHERE table2_id = ?

I went through these Wiki pages, trying to solve the problem:

http://wiki.postgresql.org/wiki/SlowQueryQuestions and
http://wiki.postgresql.org/wiki/Performance_Optimization

but unfortunately without much luck.

Our application is doing batch jobs. On every batch run, we must
delete approx. 1M rows in table1 and recreate these entries. The
inserts are very fast, but deletes are not. We cannot make updates,
because there's no identifying property in the objects of table1.

This is what EXPLAIN is telling me:

EXPLAIN ANALYZE DELETE FROM table1 WHERE table2_id = 11242939
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------
Index Scan using sr_index on table1 (cost=0.00..8.56 rows=4 width=6)
(actual time=0.111..0.154 rows=4 loops=1)
Index Cond: (table2_id = 11242939)
Total runtime: 0.421 ms
(3 rows)

This seems to be very fast (using the index), but running this query
from JDBC takes up to 20ms each. For 1M rows this sum up to several
hours. When I have a look at pg_top psql uses most of the time for the
deletes. CPU usage is 100% (for the core used by postgresql). So it
seems that postgresql is doing some sequential scanning or constraint
checks.

This is the table structure:

id bigint (primary key)
table2_id bigint (foreign key constraint to table 2, *indexed*)
table3_id bigint (foreign key constraint to table 3, *indexed*)
some non-referenced text and boolean fields

My server settings (Potgresql 8.4.2):

shared_buffers = 1024MB
effective_cache_size = 2048MB
work_mem = 128MB
wal_buffers = 64MB
checkpoint_segments = 32
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9

It would be very nice to give me a hint to solve the problem. It
drives me crazy ;-)

If you need more details please feel free to ask!

Thanks in advance for your help!

Kind regards

Thilo

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Andy Colson 2010-05-08 13:17:13 Re: Slow Bulk Delete
Previous Message Josh Berkus 2010-05-08 00:09:45 8K recordsize bad on ZFS?