Skip site navigation (1) Skip section navigation (2)

improving performance for a delete

From: kevin kempter <kevin(at)kevinkempterllc(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: improving performance for a delete
Date: 2008-05-20 19:51:45
Message-ID: 1F6D96EA-31C8-47D0-9D72-664639C387C0@kevinkempterllc.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hi all;

I have 2 tables where I basically want to delete from the first table   
(seg_id_tmp7) any rows where the entire row already exists in the  
second table (sl_cd_segment_dim)

I have a query that looks like this (and it's slow):


delete from seg_id_tmp7
where
	customer_srcid::text ||
	show_srcid::text ||
	show_name::text ||
	season_srcid::text ||
	season_name::text ||
	episode_srcid::text ||
	episode_name::text ||
	segment_type_id::text ||
	segment_type::text ||
	segment_srcid::text ||
	segment_name::text
in
	( select
		customer_srcid::text ||
		show_srcid::text ||
		show_name::text ||
		season_srcid::text ||
		season_name::text ||
		episode_srcid::text ||
		episode_name::text ||
		segment_type_id::text ||
		segment_type::text ||
		segment_srcid::text ||
		segment_name::text
		from sl_cd_location_dim )
;





Here's the query plan for it:

                                     QUERY PLAN
-----------------------------------------------------------------------------------
  Seq Scan on seg_id_tmp7  (cost=0.00..138870701.56 rows=2136 width=6)
    Filter: (subplan)
    SubPlan
      ->  Seq Scan on sl_cd_location_dim  (cost=0.00..63931.60  
rows=433040 width=8)
(4 rows)








I also tried this:

delete from seg_id_tmp7
where
	( customer_srcid ,
	show_srcid ,
	show_name ,
	season_srcid ,
	season_name ,
	episode_srcid ,
	episode_name ,
	segment_type_id ,
	segment_type ,
	segment_srcid ,
	segment_name )
in
	( select
		customer_srcid ,
		show_srcid ,
		show_name ,
		season_srcid ,
		season_name ,
		episode_srcid ,
		episode_name ,
		segment_type_id ,
		segment_type ,
		segment_srcid ,
		segment_name
		from sl_cd_location_dim )
;


and I get this query plan:

                                     QUERY PLAN
-----------------------------------------------------------------------------------
  Seq Scan on seg_id_tmp7  (cost=0.00..87997034.20 rows=2136 width=6)
    Filter: (subplan)
    SubPlan
      ->  Seq Scan on sl_cd_location_dim  (cost=0.00..40114.40  
rows=433040 width=8)
(4 rows)



If it helps here's the describe's (including indexes) for both tables:

# \d seg_id_tmp7
                 Table "public.seg_id_tmp7"
      Column      |            Type             | Modifiers
-----------------+-----------------------------+-----------
  customer_srcid  | bigint                      |
  show_srcid      | bigint                      |
  show_name       | character varying           |
  season_srcid    | bigint                      |
  season_name     | character varying           |
  episode_srcid   | bigint                      |
  episode_name    | character varying           |
  segment_type_id | bigint                      |
  segment_type    | character varying           |
  segment_srcid   | bigint                      |
  segment_name    | character varying           |
  create_dt       | timestamp without time zone |




# \d sl_cd_segment_dim
                                          Table  
"public.sl_cd_segment_dim"
         Column        |            Type              
|                          Modifiers
----------------------+----------------------------- 
+-------------------------------------------------------------
  sl_cd_segment_dim_id | bigint                      | not null  
default nextval('sl_cd_segment_dim_seq'::regclass)
  customer_srcid       | bigint                      | not null
  show_srcid           | bigint                      | not null
  show_name            | character varying(500)      | not null
  season_srcid         | bigint                      | not null
  season_name          | character varying(500)      | not null
  episode_srcid        | bigint                      | not null
  episode_name         | character varying(500)      | not null
  segment_type_id      | integer                     |
  segment_type         | character varying(500)      |
  segment_srcid        | bigint                      |
  segment_name         | character varying(500)      |
  effective_dt         | timestamp without time zone | not null  
default now()
  inactive_dt          | timestamp without time zone |
  last_update_dt       | timestamp without time zone | not null  
default now()
Indexes:
     "sl_cd_segment_dim_pk" PRIMARY KEY, btree (sl_cd_segment_dim_id)
     "seg1" btree (customer_srcid)
     "seg2" btree (show_srcid)
     "seg3" btree (season_srcid)
     "seg4" btree (episode_srcid)
     "seg5" btree (segment_srcid)
     "sl_cd_segment_dim_ix1" btree (customer_srcid)






Any thoughts, suggestions, etc on how to improve performance for this  
delete ?


Thanks in advance..

/Kevin



Responses

pgsql-performance by date

Next:From: kevin kempterDate: 2008-05-20 20:03:30
Subject: Re: improving performance for a delete
Previous:From: Scott MarloweDate: 2008-05-20 17:47:13
Subject: Re: slow update

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group