Horribly slow hash join

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Horribly slow hash join
Date: 2004-04-16 15:45:02
Message-ID: 20040416154502.GF87362@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Note the time for the hash join step:


------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=357.62..26677.99 rows=93668 width=62) (actual time=741.159..443381.011 rows=49091 loops=1)
Hash Cond: ("outer".work_today = "inner".work_units)
-> Hash Join (cost=337.11..24784.11 rows=93668 width=54) (actual time=731.374..417188.519 rows=49091 loops=1)
Hash Cond: ("outer".work_total = "inner".work_units)
-> Seq Scan on email_rank (cost=0.00..22240.04 rows=254056 width=46) (actual time=582.145..1627.759 rows=49091 loops=1)
Filter: (project_id = 8)
-> Hash (cost=292.49..292.49 rows=17849 width=16) (actual time=148.944..148.944 rows=0 loops=1)
-> Seq Scan on rank_tie_overall o (cost=0.00..292.49 rows=17849 width=16) (actual time=0.059..75.984 rows=17849 loops=1)
-> Hash (cost=17.81..17.81 rows=1081 width=16) (actual time=8.996..8.996 rows=0 loops=1)
-> Seq Scan on rank_tie_today d (cost=0.00..17.81 rows=1081 width=16) (actual time=0.080..4.635 rows=1081 loops=1)
Total runtime: 619047.032 ms

By comparison:
stats=# set enable_hashjoin=false;
SET
stats=# explain analyze select * from email_rank, rank_tie_overall o, rank_tie_today d WHERE email_rank.work_today = d.work_units AND email_rank.work_total = o.work_units AND email_rank.project_id = :ProjectID;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=55391.69..56823.23 rows=93668 width=80) (actual time=2705.344..3349.318 rows=49091 loops=1)
Merge Cond: ("outer".work_units = "inner".work_today)
-> Index Scan using work_units_today on rank_tie_today d (cost=0.00..23.89 rows=1081 width=16) (actual time=0.150..4.874 rows=1081 loops=1)
-> Sort (cost=55391.69..55625.86 rows=93668 width=64) (actual time=2705.153..2888.039 rows=49091 loops=1)
Sort Key: email_rank.work_today
-> Merge Join (cost=45047.64..47656.93 rows=93668 width=64) (actual time=1685.414..2494.342 rows=49091 loops=1)
Merge Cond: ("outer".work_units = "inner".work_total)
-> Index Scan using work_units_overall on rank_tie_overall o (cost=0.00..361.34 rows=17849 width=16) (actual time=0.122..79.383 rows=17849 loops=1)
-> Sort (cost=45047.64..45682.78 rows=254056 width=48) (actual time=1685.228..1866.215 rows=49091 loops=1)
Sort Key: email_rank.work_total
-> Seq Scan on email_rank (cost=0.00..22240.04 rows=254056 width=48) (actual time=786.515..1289.101 rows=49091 loops=1)
Filter: (project_id = 8)
Total runtime: 3548.087 ms

Even though the second case is only a select, it seems clear that
something's wrong...
--
Jim C. Nasby, Database Consultant jim(at)nasby(dot)net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mike Nolan 2004-04-16 15:46:02 Re: Long running queries degrade performance
Previous Message Robert Treat 2004-04-16 15:37:15 Re: query slows down with more accurate stats