Skip site navigation (1) Skip section navigation (2)

Re: Help speeding up delete

From: Steve Wampler <swampler(at)noao(dot)edu>
To: Magnus Hagander <mha(at)sollentuna(dot)net>
Cc: Postgres-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Help speeding up delete
Date: 2005-11-15 14:18:12
Message-ID: 4379EE24.8000504@noao.edu (view raw or flat)
Thread:
Lists: pgsql-performance
Magnus Hagander wrote:
>>Because I think we need to.  The above would only delete rows 
>>that have name = 'obsid' and value = 'oid080505'.  We need to 
>>delete all rows that have the same ids as those rows.  
>>However, from what you note, I bet we could do:
>>
>>   DELETE FROM "tmp_table2" WHERE id IN
>>      (SELECT id FROM "temp_table2" WHERE name = 'obsid' and 
>>value= 'oid080505');
>>
>>However, even that seems to have a much higher cost than I'd expect:
>>
>>   lab.devel.configdb=# explain delete from "tmp_table2" where id in
>>        (select id from tmp_table2 where name='obsid' and 
>>value = 'oid080505');
>>   NOTICE:  QUERY PLAN:
>>
>>   Seq Scan on tmp_table2  (cost=0.00..65705177237.26 
>>rows=769844 width=6)
>>     SubPlan
>>       ->  Materialize  (cost=42674.32..42674.32 rows=38 width=50)
>>             ->  Seq Scan on tmp_table2  (cost=0.00..42674.32 
>>rows=38 width=50)
>>
>>   EXPLAIN
...
> 
> Earlier pg versions have always been bad at dealing with IN subqueries.
> Try rewriting it as (with fixing any broken syntax, I'm not actually
> testing this :P)
> 
> DELETE FROM tmp_table2 WHERE EXISTS 
>  (SELECT * FROM tmp_table2 t2 WHERE t2.id=tmp_table2.id AND
> t2.name='obsid' AND t2.value='oid080505')

Thanks - that looks *significantly* better:

   lab.devel.configdb=# explain delete from tmp_table2 where exists
                           (select 1 from tmp_table2 t2 where
                           t2.id=tmp_table2.id and
                           t2.name='obsid' and t2.value='oid080505');
   NOTICE:  QUERY PLAN:

   Seq Scan on tmp_table2  (cost=0.00..9297614.80 rows=769844 width=6)
     SubPlan
       ->  Index Scan using inv_index_2 on tmp_table2 t2  (cost=0.00..6.02 rows=1 width=0)

   EXPLAIN

(This is after putting an index on the (id,name,value) tuple.)  That outer seq scan
is still annoying, but maybe this will be fast enough.

I've passed this on, along with the (strong) recommendation that they
upgrade PG.

Thanks!!

-- 
Steve Wampler -- swampler(at)noao(dot)edu
The gods that smiled on your birth are now laughing out loud.

In response to

Responses

pgsql-performance by date

Next:From: Luke LonerganDate: 2005-11-15 14:33:25
Subject: Re: Hardware/OS recommendations for large databases (
Previous:From: Dave CramerDate: 2005-11-15 14:15:04
Subject: Re: Hardware/OS recommendations for large databases (

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group