Quick Links

Costly "Sort Key" on indexed timestamp column

From:	<ogjunk-pgjedan(at)yahoo(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Costly "Sort Key" on indexed timestamp column
Date:	2004-09-09 22:51:37
Message-ID:	20040909225137.77822.qmail@web12701.mail.yahoo.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Hello,

I'm tuning my PostgreSQL DB (7.3.4) and have come across a query that
doesn't use an index I created specially for it, and consequently takes
circa 2 seconds to run. :(

The ugly query looks like this (the important part is really at the
very end - order by piece):

select userinfo1_.id as id0_, servicepla3_.id as id1_, account2_.id as
id2_, passwordhi4_.id as id3_, userdemogr5_.id as id4_,
userinfo1_.first_name as first_name0_, userinfo1_.last_name as
last_name0_, userinfo1_.email as email0_, userinfo1_.href as href0_,
userinfo1_.last_login_date as last_log6_0_, userinfo1_.login_count as
login_co7_0_, userinfo1_.password_hint_answer as password8_0_,
userinfo1_.create_date as create_d9_0_, userinfo1_.exp_date as
exp_date0_, userinfo1_.type as type0_, userinfo1_.account_id as
account_id0_, userinfo1_.plan_id as plan_id0_,
userinfo1_.password_hint_id as passwor14_0_,
userinfo1_.user_demographic_id as user_de15_0_, servicepla3_.name as
name1_, servicepla3_.max_links as max_links1_, account2_.username as
username2_, account2_.password as password2_, account2_.status as
status2_, passwordhi4_.question as question3_, userdemogr5_.city as
city4_, userdemogr5_.postal_code as postal_c3_4_,
userdemogr5_.country_id as country_id4_,
userdemogr5_.state_id as state_id4_, userdemogr5_.gender_id as
gender_id4_ from user_preference userprefer0_ inner join user_info
userinfo1_ on userprefer0_.user_id=userinfo1_.id inner join account
account2_ on userinfo1_.account_id=account2_.id inner join service_plan
servicepla3_ on userinfo1_.plan_id=servicepla3_.id left outer join
password_hint passwordhi4_ on
userinfo1_.password_hint_id=passwordhi4_.id inner join user_demographic
userdemogr5_ on userinfo1_.user_demographic_id=userdemogr5_.id,
preference preference6_, preference_value preference7_ where
(preference6_.name='allow_subscribe' and
userprefer0_.preference_id=preference6_.id)AND(preference7_.value=1
and userprefer0_.preference_value_id=preference7_.id) order by
userinfo1_.create_date desc limit 10;

The output of EXPLAIN ANALYZE follows. Note how 99% of the total cost
comes from "Sort Key: userinfo1_.create_date". When I saw this, I
created an index for this:

CREATE INDEX ix_user_info_create_date ON user_info(create_date);

But that didn't seem to make much of a difference. The total cost did
go down from about 1250 to 1099, but that's still too high.

---------------------------------------------------------
Limit (cost=1099.35..1099.38 rows=10 width=222) (actual
time=1914.13..1914.17 rows=10 loops=1)
-> Sort (cost=1099.35..1099.43 rows=31 width=222) (actual
time=1914.12..1914.14 rows=11 loops=1)
Sort Key: userinfo1_.create_date
-> Hash Join (cost=90.71..1098.60 rows=31 width=222) (actual
time=20.34..1908.41 rows=767 loops=1)
Hash Cond: ("outer".preference_value_id = "inner".id)
-> Hash Join (cost=89.28..1092.58 rows=561 width=218)
(actual time=19.92..1886.59 rows=768 loops=1)
Hash Cond: ("outer".preference_id = "inner".id)
-> Hash Join (cost=88.10..1045.14 rows=7850
width=214) (actual time=19.44..1783.47 rows=9984 loops=1)
Hash Cond: ("outer".user_demographic_id =
"inner".id)
-> Hash Join (cost=72.59..864.51 rows=8933
width=190) (actual time=14.83..1338.15 rows=9984 loops=1)
Hash Cond: ("outer".password_hint_id =
"inner".id)
-> Hash Join (cost=71.50..726.87
rows=8933 width=161) (actual time=14.53..1039.69 rows=9984 loops=1)
Hash Cond: ("outer".plan_id =
"inner".id)
-> Hash Join
(cost=70.42..569.46 rows=8933 width=144) (actual time=14.26..700.80
rows=9984 loops=1)
Hash Cond:
("outer".account_id = "inner".id)
-> Hash Join
(cost=53.83..390.83 rows=10073 width=116) (actual time=9.67..373.71
rows=9984 loops=1)
Hash Cond:
("outer".user_id = "inner".id)
-> Seq Scan on
user_preference userprefer0_ (cost=0.00..160.73 rows=10073 width=12)
(actual time=0.09..127.64 rows=9984 loops=1)
-> Hash
(cost=51.66..51.66 rows=866 width=104) (actual time=9.40..9.40 rows=0
loops=1)
-> Seq Scan
on user_info userinfo1_ (cost=0.00..51.66 rows=866 width=104) (actual
time=0.12..7.15 rows=768 loops=1)
-> Hash
(cost=14.68..14.68 rows=768 width=28) (actual time=4.45..4.45 rows=0
loops=1)
-> Seq Scan on
account account2_ (cost=0.00..14.68 rows=768 width=28) (actual
time=0.10..2.56 rows=768 loops=1)
-> Hash (cost=1.06..1.06
rows=6 width=17) (actual time=0.13..0.13 rows=0 loops=1)
-> Seq Scan on
service_plan servicepla3_ (cost=0.00..1.06 rows=6 width=17) (actual
time=0.10..0.11 rows=6 loops=1)
-> Hash (cost=1.07..1.07 rows=7
width=29) (actual time=0.15..0.15 rows=0 loops=1)
-> Seq Scan on password_hint
passwordhi4_ (cost=0.00..1.07 rows=7 width=29) (actual
time=0.11..0.13 rows=7 loops=1)
-> Hash (cost=13.61..13.61 rows=761
width=24) (actual time=4.46..4.46 rows=0 loops=1)
-> Seq Scan on user_demographic
userdemogr5_ (cost=0.00..13.61 rows=761 width=24) (actual
time=0.10..2.73 rows=769 loops=1)
-> Hash (cost=1.18..1.18 rows=1 width=4) (actual
time=0.16..0.16 rows=0 loops=1)
-> Seq Scan on preference preference6_
(cost=0.00..1.18 rows=1 width=4) (actual time=0.14..0.15
rows=1 loops=1)
Filter: (name =
'allow_subscribe'::character varying)
-> Hash (cost=1.43..1.43 rows=2 width=4) (actual
time=0.23..0.23 rows=0 loops=1)
-> Seq Scan on preference_value preference7_
(cost=0.00..1.43 rows=2 width=4) (actual time=0.17..0.21
rows=3 loops=1)
Filter: ((value)::text = '1'::text)
Total runtime: 1914.91 msec
(35 rows)

There are a few Seq Scan's, but they are benign, as their low/no cost
shows - they are very small, 'static' tables (e.g. country list, state
list, preference names list).

Does anyone have any ideas how I could speed up this query?

Thanks,
Otis

Responses

Re: Costly "Sort Key" on indexed timestamp column at 2004-09-10 04:17:11 from Tom Lane

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Neil Conway	2004-09-09 23:06:23	Re: Multiple Uniques
Previous Message	Oleg Bartunov	2004-09-09 17:56:59	Re: Text Search vs MYSQL vs Lucene