Extremely slow query...

From: pathat(at)comcast(dot)net
To: pgsql-performance(at)postgresql(dot)org
Subject: Extremely slow query...
Date: 2004-07-29 15:54:49
Message-ID: 072920041554.15818.41091DC90005607500003DCA22007374789B0E089B0E9F@comcast.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I have the following query that I have to quit after 10 mins. I have vacuum full analyzed each of the tables and still nothing seems to help. There are indexes on all the join fields. I've included the query, explain and the table structures. Please let me know if there is anything else I should provide.

SELECT m.r_score,m.f_score,m.m_score,m.rfm_score,ed.email_id, '00'::varchar as event_id,
COUNT(e.email_adr) AS count
FROM cdm.cdm_indiv_mast m
INNER JOIN cdm.cdm_email_data e on e.indiv_fkey = m.indiv_key
INNER JOIN cdm.email_sent es on es.email_address = e.email_adr
inner join cdm.email_description ed on ed.email_id = es.email_id
where (ed.event_date::date between '2003-10-07' and '2003-12-31')
and m.m_score >0
GROUP BY 1,2,3,4,5,6

QUERY PLAN
HashAggregate (cost=551716.04..551724.14 rows=3241 width=58)
-> Hash Join (cost=417315.57..546391.36 rows=304267 width=58)
Hash Cond: (("outer".email_address)::text = ("inner".email_adr)::text)
-> Nested Loop (cost=0.00..85076.89 rows=1309498 width=42)
-> Seq Scan on email_description ed (cost=0.00..20.72 rows=3 width=19)
Filter: (((event_date)::date >= '2003-10-07'::date) AND ((event_date)::date <= '2003-12-31'::date))
-> Index Scan using emailsnt_id_idx on email_sent es (cost=0.00..20964.10 rows=591036 width=42)
Index Cond: (("outer".email_id)::text = (es.email_id)::text)
-> Hash (cost=404914.28..404914.28 rows=1202517 width=39)
-> Hash Join (cost=144451.53..404914.28 rows=1202517 width=39)
Hash Cond: ("outer".indiv_fkey = "inner".indiv_key)
-> Seq Scan on cdm_email_data e (cost=0.00..93002.83 rows=5175383 width=31)
-> Hash (cost=134399.24..134399.24 rows=1202517 width=24)
-> Index Scan using m_score_idx on cdm_indiv_mast m (cost=0.00..134399.24 rows=1202517 width=24)
Index Cond: (m_score > 0)

CREATE TABLE cdm.cdm_email_data
(
email_adr varchar(75) NOT NULL,
opt_out char(1) DEFAULT 'n'::bpchar,
indiv_fkey int8 NOT NULL,
CONSTRAINT email_datuniq UNIQUE (email_adr, indiv_fkey)
) WITHOUT OIDS;
CREATE INDEX emaildat_email_idx ON cdm.cdm_email_data USING btree (email_adr);

CREATE TABLE cdm.cdm_indiv_mast
(
name_first varchar(20),
name_middle varchar(20),
name_last varchar(30),
name_suffix varchar(5),
addr1 varchar(40),
addr2 varchar(40),
addr3 varchar(40),
city varchar(25),
state varchar(7),
r_score int4,
f_score int4,
m_score int4,
rfm_score int4,
rfm_segment int4,
CONSTRAINT indiv_mast_pk PRIMARY KEY (indiv_key)
) WITH OIDS;
CREATE INDEX f_score_idx ON cdm.cdm_indiv_mast USING btree (f_score);
CREATE INDEX m_score_idx ON cdm.cdm_indiv_mast USING btree (m_score);
CREATE INDEX r_score_idx ON cdm.cdm_indiv_mast USING btree (r_score);

CREATE TABLE cdm.email_description
(
email_id varchar(20) NOT NULL,
event_date timestamp,
affiliate varchar(75),
event_name varchar(100),
mailing varchar(255),
category varchar(100),
div_code varchar(30),
mkt_category varchar(50),
merch_code varchar(50),
campaign_code varchar(50),
offer_code varchar(30),
CONSTRAINT email_desc_pk PRIMARY KEY (email_id)
) WITHOUT OIDS;
CREATE INDEX email_desc_id_idx ON cdm.email_description USING btree (email_id);
CREATE INDEX eml_desc_date_idx ON cdm.email_description USING btree (event_date);

CREATE TABLE cdm.email_sent
(
email_address varchar(75),
email_id varchar(20),
email_sent_ts timestamp,
email_type char(1)
) WITHOUT OIDS;
CREATE INDEX emailsnt_id_idx ON cdm.email_sent USING btree (email_id);
CREATE INDEX email_sent_emailidx ON cdm.email_sent USING btree (email_address);

-TIA
-Patrick Hatcher

Browse pgsql-performance by date

  From Date Subject
Next Message Stephan Szabo 2004-07-29 16:08:55 Re: Optimizer refuses to hash join
Previous Message Tom Lane 2004-07-28 23:11:06 Re: best way to fetch next/prev record based on index