Query performance with disabled hashjoin and mergejoin

From: Ivan Voras <ivoras(at)freebsd(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Query performance with disabled hashjoin and mergejoin
Date: 2011-02-04 12:08:23
Message-ID: iigq7o$3ub$1@dough.gmane.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I'm running all this on a 9.0 server with good enough hardware. The
query is:

SELECT news.id AS news_id
, news.layout_id
, news.news_relation_id
, news.author_id
, news.date_created
, news.date_published
, news.lastedit
, news.lastedit_user_id
, news.lastedit_date
, news.approved_by
, news.state
, news.visible_from
, news.visible_to
, news.archived_by
, news.archived_date
, news.priority
, news.collection_id
, news.comment
, news.keywords
, news.icon
, news.icon_url
, news.icon_width
, news.icon_height
, news.icon_position
, news.icon_onclick
, news.icon_newwindow
, news.no_lead
, news.content_exists
, news.title, news.lead, news.content

, author.public_name AS
author_public_name
, lastedit_user.public_name AS
lastedit_user_public_name
, approved_by_user.public_name AS
approved_by_public_name
, archived_by_user.public_name AS
archived_by_public_name
FROM news
JOIN users AS author ON news.author_id
= author.id
LEFT JOIN users AS lastedit_user ON
news.lastedit_user_id = lastedit_user.id
LEFT JOIN users AS approved_by_user ON
news.approved_by = approved_by_user.id
LEFT JOIN users AS archived_by_user ON
news.archived_by = archived_by_user.id

WHERE (news.layout_id = 8980) AND (state =
2) AND (date_published <= 1296806570 AND (visible_from IS NULL OR
1296806570 BETWEEN visible_f
rom AND visible_to))
ORDER BY priority DESC, date_published DESC
;

The "vanilla" plan, with default settings is:

Sort (cost=7325.84..7329.39 rows=1422 width=678) (actual
time=100.846..100.852 rows=7 loops=1)
Sort Key: news.priority, news.date_published
Sort Method: quicksort Memory: 38kB
-> Hash Left Join (cost=2908.02..7251.37 rows=1422 width=678)
(actual time=100.695..100.799 rows=7 loops=1)
Hash Cond: (news.archived_by = archived_by_user.id)
-> Hash Left Join (cost=2501.75..6819.47 rows=1422
width=667) (actual time=76.742..76.830 rows=7 loops=1)
Hash Cond: (news.approved_by = approved_by_user.id)
-> Hash Left Join (cost=2095.48..6377.69 rows=1422
width=656) (actual time=53.248..53.318 rows=7 loops=1)
Hash Cond: (news.lastedit_user_id = lastedit_user.id)
-> Hash Join (cost=1689.21..5935.87 rows=1422
width=645) (actual time=29.793..29.846 rows=7 loops=1)
Hash Cond: (news.author_id = author.id)
-> Bitmap Heap Scan on news
(cost=1282.94..5494.05 rows=1422 width=634) (actual time=5.532..5.560
rows=7 loops=1)
Recheck Cond: ((layout_id = 8980) AND
(state = 2) AND ((visible_from IS NULL) OR (1296806570 <= visible_to)))
Filter: ((date_published <=
1296806570) AND ((visible_from IS NULL) OR ((1296806570 >= visible_from)
AND (1296806570 <= visible_to))))
-> BitmapAnd (cost=1282.94..1282.94
rows=1430 width=0) (actual time=5.508..5.508 rows=0 loops=1)
-> Bitmap Index Scan on
news_index_layout_id_state (cost=0.00..150.14 rows=2587 width=0)
(actual time=0.909..0.909 rows=3464 loops=1)
Index Cond: ((layout_id =
8980) AND (state = 2))
-> BitmapOr
(cost=1132.20..1132.20 rows=20127 width=0) (actual time=4.136..4.136
rows=0 loops=1)
-> Bitmap Index Scan on
news_visible_from (cost=0.00..1122.09 rows=19976 width=0) (actual
time=3.367..3.367 rows=19932 loops=1)
Index Cond:
(visible_from IS NULL)
-> Bitmap Index Scan on
news_visible_to (cost=0.00..9.40 rows=151 width=0) (actual
time=0.766..0.766 rows=43 loops=1)
Index Cond:
(1296806570 <= visible_to)
-> Hash (cost=281.12..281.12 rows=10012
width=15) (actual time=24.247..24.247 rows=10012 loops=1)
Buckets: 1024 Batches: 1 Memory
Usage: 482kB
-> Seq Scan on users author
(cost=0.00..281.12 rows=10012 width=15) (actual time=0.004..11.354
rows=10012 loops=1)
-> Hash (cost=281.12..281.12 rows=10012
width=15) (actual time=23.444..23.444 rows=10012 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 482kB
-> Seq Scan on users lastedit_user
(cost=0.00..281.12 rows=10012 width=15) (actual time=0.004..10.752
rows=10012 loops=1)
-> Hash (cost=281.12..281.12 rows=10012 width=15)
(actual time=23.481..23.481 rows=10012 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 482kB
-> Seq Scan on users approved_by_user
(cost=0.00..281.12 rows=10012 width=15) (actual time=0.002..10.695
rows=10012 loops=1)
-> Hash (cost=281.12..281.12 rows=10012 width=15) (actual
time=23.941..23.941 rows=10012 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 482kB
-> Seq Scan on users archived_by_user
(cost=0.00..281.12 rows=10012 width=15) (actual time=0.003..10.673
rows=10012 loops=1)
Total runtime: 101.302 ms
(35 rows)

But with these changes:

set enable_hashjoin=f;
set enable_mergejoin=f;

the plan becomes:

Sort (cost=9786.25..9789.87 rows=1446 width=678) (actual
time=5.408..5.414 rows=7 loops=1)
Sort Key: news.priority, news.date_published
Sort Method: quicksort Memory: 38kB
-> Nested Loop Left Join (cost=439.10..9710.35 rows=1446
width=678) (actual time=5.133..5.364 rows=7 loops=1)
-> Nested Loop Left Join (cost=439.10..8459.74 rows=1446
width=667) (actual time=5.128..5.330 rows=7 loops=1)
-> Nested Loop Left Join (cost=439.10..7209.12
rows=1446 width=656) (actual time=5.122..5.271 rows=7 loops=1)
-> Nested Loop (cost=439.10..5958.51 rows=1446
width=645) (actual time=5.112..5.204 rows=7 loops=1)
-> Bitmap Heap Scan on news
(cost=439.10..4707.89 rows=1446 width=634) (actual time=5.096..5.122
rows=7 loops=1)
Recheck Cond: ((layout_id = 8980) AND
(state = 2) AND ((visible_from IS NULL) OR (1296806570 <= visible_to)))
Filter: ((date_published <=
1296806570) AND ((visible_from IS NULL) OR ((1296806570 >= visible_from)
AND (1296806570 <= visible_to))))
-> BitmapAnd (cost=439.10..439.10
rows=1455 width=0) (actual time=5.073..5.073 rows=0 loops=1)
-> Bitmap Index Scan on
news_index_layout_id_state (cost=0.00..58.62 rows=2637 width=0) (actual
time=0.880..0.880 rows=3464 loops=1)
Index Cond: ((layout_id =
8980) AND (state = 2))
-> BitmapOr
(cost=379.86..379.86 rows=20084 width=0) (actual time=3.734..3.734
rows=0 loops=1)
-> Bitmap Index Scan on
news_visible_from (cost=0.00..373.74 rows=19932 width=0) (actual
time=3.255..3.255 rows=19932 loops=1)
Index Cond:
(visible_from IS NULL)
-> Bitmap Index Scan on
news_visible_to (cost=0.00..5.39 rows=152 width=0) (actual
time=0.476..0.476 rows=43 loops=1)
Index Cond:
(1296806570 <= visible_to)
-> Index Scan using users_pkey on users
author (cost=0.00..0.85 rows=1 width=15) (actual time=0.006..0.007
rows=1 loops=7)
Index Cond: (author.id = news.author_id)
-> Index Scan using users_pkey on users
lastedit_user (cost=0.00..0.85 rows=1 width=15) (actual
time=0.004..0.005 rows=1 loops=7)
Index Cond: (news.lastedit_user_id =
lastedit_user.id)
-> Index Scan using users_pkey on users
approved_by_user (cost=0.00..0.85 rows=1 width=15) (actual
time=0.002..0.004 rows=1 loops=7)
Index Cond: (news.approved_by = approved_by_user.id)
-> Index Scan using users_pkey on users archived_by_user
(cost=0.00..0.85 rows=1 width=15) (actual time=0.001..0.001 rows=0 loops=7)
Index Cond: (news.archived_by = archived_by_user.id)
Total runtime: 5.605 ms
(27 rows)

Note the difference in execution times: 100 ms vs 5 ms.

So far, I've tried increasing statistics to 1000 on state, layout_id,
author_id, lastedit_user_id, approved_by, archived_by fields, reindexing
and vacuum analyze-ing it, but with the default settings the planner
keeps missing the mark.

The news table is:

Table "public.news"
Column | Type |
Modifiers
------------------+------------------------+---------------------------------------------------
id | integer | not null default
nextval('news_id_seq'::regclass)
layout_id | integer | not null
news_relation_id | integer | not null
author_id | integer | not null default 10
date_created | integer | not null
date_published | integer | not null
lastedit | boolean | not null default false
lastedit_user_id | integer | not null default 10
lastedit_date | integer | not null
approved_by | integer | default 10
state | smallint | not null
visible_from | integer |
visible_to | integer |
archived_by | integer | default 10
archived_date | integer |
priority | smallint | not null default 5
collection_id | integer |
comment | boolean | not null default false
keywords | text | not null default ''::text
icon | boolean | not null default false
icon_url | text |
icon_width | smallint |
icon_height | smallint |
icon_position | character(1) |
icon_onclick | text |
icon_newwindow | boolean |
title | character varying(300) | not null
no_lead | boolean | not null default false
content_exists | boolean | not null default false
lead | text | not null
content | text | not null default ''::text
_fts_ | tsvector |
Indexes:
"news_pkey" PRIMARY KEY, btree (id)
"news_layout_id_key" UNIQUE, btree (layout_id, news_relation_id)
"forms_index_layout_id_state" btree (layout_id, state)
"ii1" btree (author_id)
"ii2" btree (lastedit_user_id)
"ii3" btree (approved_by)
"ii4" btree (archived_by)
"news_fts" gin (_fts_)
"news_index_date_published" btree (date_published)
"news_index_lastedit" btree (lastedit_date)
"news_index_layout_id" btree (layout_id)
"news_index_layout_id_state" btree (layout_id, state)
"news_index_priority" btree (priority)
"news_visible_from" btree (visible_from)
"news_visible_to" btree (visible_to)

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Grant Johnson 2011-02-04 13:05:33 Re: [HACKERS] Slow count(*) again...
Previous Message Greg Smith 2011-02-04 11:33:48 Re: getting the most of out multi-core systems for repeated complex SELECT statements