Efficient DELETE Strategies

From: Christoph Haller <ch(at)rodos(dot)fzk(dot)de>
To: pgsql-sql(at)postgresql(dot)org
Subject: Efficient DELETE Strategies
Date: 2002-06-10 13:42:10
Message-ID: 200206101142.NAA16854@rodos
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-sql

Hi,

Based on an entry in the mailing list from 30 Oct 2001
about efficient deletes on subqueries,
I've found two ways to do so (PostgreSQL 7.2.1):

1.
BEGIN ;
EXPLAIN ANALYZE
DELETE FROM onfvalue WHERE EXISTS(
SELECT * FROM onfvalue j WHERE
j.sid= 5 AND
onfvalue.lid = j.lid AND
onfvalue.mid = j.mid AND
onfvalue.timepoint = j.timepoint AND
onfvalue.entrancetime < j.entrancetime
) ;
ROLLBACK ;
QUERY PLAN:

Seq Scan on onfvalue
(cost=0.00..805528.05 rows=66669 width=6)
(actual time=61.84..25361.82 rows=24 loops=1)
SubPlan
-> Index Scan using advncd_onfvalue_idx_stlme on onfvalue j
(cost=0.00..6.02 rows=1 width=36)
(actual time=0.14..0.14 rows=0 loops=133338)
Total runtime: 25364.76 msec

2.
BEGIN ;
EXPLAIN ANALYZE
INSERT INTO temprefentrancetime(timepoint,lid,mid,sid,entrancetime)
SELECT o.timepoint,o.lid,o.mid,o.sid,o.entrancetime
FROM onfvalue o join onfvalue j ON (
o.lid = j.lid AND
o.mid = j.mid AND
o.timepoint = j.timepoint AND
o.entrancetime < j.entrancetime
) WHERE o.sid= 5 ;
EXPLAIN ANALYZE
DELETE FROM onfvalue WHERE
onfvalue.timepoint = temprefentrancetime.timepoint AND
onfvalue.mid = temprefentrancetime.mid AND
onfvalue.lid = temprefentrancetime.lid AND
onfvalue.sid = temprefentrancetime.sid AND
onfvalue.entrancetime = temprefentrancetime.entrancetime ;
DELETE FROM temprefentrancetime;
ROLLBACK ;
QUERY PLAN:

Merge Join
(cost=16083.12..16418.36 rows=4 width=52)
(actual time=17728.06..19325.02 rows=24 loops=1)
-> Sort
(cost=2152.53..2152.53 rows=667 width=28)
(actual time=1937.70..2066.46 rows=16850 loops=1)
-> Index Scan using advncd_onfvalue_idx_stlme on onfvalue o
(cost=0.00..2121.26 rows=667 width=28)
(actual time=0.57..709.89 rows=16850 loops=1)
-> Sort
(cost=13930.60..13930.60 rows=133338 width=24)
(actual time=13986.07..14997.43 rows=133110 loops=1)
-> Seq Scan on onfvalue j
(cost=0.00..2580.38 rows=133338 width=24)
(actual time=0.15..3301.06 rows=133338 loops=1)
Total runtime: 19487.49 msec

QUERY PLAN:

Nested Loop
(cost=0.00..6064.40 rows=1 width=62)
(actual time=1.34..8.32 rows=24 loops=1)
-> Seq Scan on temprefentrancetime
(cost=0.00..20.00 rows=1000 width=28)
(actual time=0.44..1.07 rows=24 loops=1)
-> Index Scan using advncd_onfvalue_idx_stlme on onfvalue
(cost=0.00..6.02 rows=1 width=34)
(actual time=0.22..0.25 rows=1 loops=24)
Total runtime: 10.15 msec

The questions are:
Is there a way to put the second form (more complicated, but faster)
in one statement?
Or is there even a third way to delete, which I cannot see?
Regards, Christoph

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Karel Zak 2002-06-10 13:43:34 Re: Timestamp/Interval proposals: Part 2
Previous Message NunoACHenriques 2002-06-10 13:39:01 Re: tuplesort: unexpected end of data

Browse pgsql-sql by date

  From Date Subject
Next Message Joseph Syjuco 2002-06-10 13:51:18 Re: arrays as pgsql function parameters
Previous Message Tom Lane 2002-06-10 13:22:28 Re: Rule to fill in value on column on insert