Re: Planner incorrectly choosing seq scan over index scan

From: John Arbash Meinel <john(at)arbash-meinel(dot)com>
To: meetesh(dot)karia(at)alumni(dot)duke(dot)edu
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Planner incorrectly choosing seq scan over index scan
Date: 2005-08-01 23:16:27
Message-ID: 42EEAD4B.4010905@arbash-meinel.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Meetesh Karia wrote:
> Hi all,
>
> We're using 8.0.3 and we're seeing a problem where the planner is
> choosing a seq scan and hash join over an index scan. If I set
> enable_hashjoin to off, then I get the plan I'm expecting and the query
> runs a lot faster. I've also tried lowering the random page cost (even
> to 1) but the planner still chooses to use the hash join.
>
> Does anyone have any thoughts/suggestions? I saw that there was a
> thread recently in which the planner wasn't correctly estimating the
> cost for queries using LIMIT. Is it possible that something similar is
> happening here (perhaps because of the sort) and that the patch Tom
> proposed would fix it?
>
> Thanks. Here are the various queries and plans:
>
> Normal settings

...

> QUERY PLAN
> Sort (cost=13430.57..13439.24 rows=3467 width=48) (actual
> time=1390.000..1390.000 rows=3467 loops=1)
> Sort Key: c.sourceid, c.targetid
> -> Merge Join (cost=9912.07..13226.72 rows=3467 width=48) (actual
> time=1344.000..1375.000 rows=3467 loops=1)
> Merge Cond: ("outer".user_id = "inner".sourceid)
> -> Index Scan using lte_user_pkey on lte_user s
> (cost=0.00..16837.71 rows=279395 width=16) (actual time=0.000..95.000
> rows=50034 loops=1)

This is where the planner is messing up, and mis-estimating the
selectivity. It is expecting to get 280k rows, but only needs to get 50k.
I assume lte_user is the bigger table, and that candidates617004 has
some subset.

Has lte_user and candidates617004 been recently ANALYZEd? All estimates,
except for the expected number of rows from lte_user seem to be okay.

Is user_id the primary key for lte_user?
I'm trying to figure out how you can get 50k rows, by searching a
primary key, against a 3.5k rows. Is user_id only part of the primary
key for lte_user?

Can you give us the output of:
\d lte_user
\d candidates617004

So that we have the description of the tables, and what indexes you have
defined?

Also, if you could describe the table layouts, that would help.

John
=:->

> -> Sort (cost=9912.07..9920.73 rows=3467 width=40) (actual
> time=1156.000..1156.000 rows=3467 loops=1)
> Sort Key: c.sourceid
> -> Hash Join (cost=8710.44..9708.21 rows=3467 width=40)
> (actual time=1125.000..1156.000 rows=3467 loops=1)
> Hash Cond: ("outer".targetid = "inner".user_id)
> -> Seq Scan on candidates617004 c
> (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..0.000
> rows=3467 loops=1)
> -> Hash (cost=8011.95..8011.95 rows=279395
> width=16) (actual time=1125.000..1125.000 rows=0 loops=1)
> -> Seq Scan on lte_user t
> (cost=0.00..8011.95 rows=279395 width=16) (actual time=0.000..670.000
> rows=279395 loops=1)
> Total runtime: 1406.000 ms
>
> enable_hashjoin disabled
> ----------------------------------------
> QUERY PLAN
> Sort (cost=14355.37..14364.03 rows=3467 width=48) (actual
> time=391.000..391.000 rows=3467 loops=1)
> Sort Key: c.sourceid, c.targetid
> -> Nested Loop (cost=271.52..14151.51 rows=3467 width=48) (actual
> time=203.000..359.000 rows=3467 loops=1)
> -> Merge Join (cost=271.52..3490.83 rows=3467 width=40)
> (actual time=203.000..218.000 rows=3467 loops=1)
> Merge Cond: ("outer".user_id = "inner".sourceid)
> -> Index Scan using lte_user_pkey on lte_user s
> (cost=0.00..16837.71 rows=279395 width=16) (actual time=0.000..126.000
> rows=50034 loops=1)
> -> Sort (cost=271.52..280.19 rows=3467 width=32) (actual
> time=15.000..30.000 rows=3467 loops=1)
> Sort Key: c.sourceid
> -> Seq Scan on candidates617004 c
> (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..0.000
> rows=3467 loops=1)
> -> Index Scan using lte_user_pkey on lte_user t
> (cost=0.00..3.03 rows=1 width=16) (actual time=0.031..0.036 rows=1
> loops=3467)
> Index Cond: ("outer".targetid = t.user_id)
> Total runtime: 406.000 ms
>
> random_page_cost set to 1.5
> ----------------------------------------------
> QUERY PLAN
> Sort (cost=12702.62..12711.29 rows=3467 width=48) (actual
> time=1407.000..1407.000 rows=3467 loops=1)
> Sort Key: c.sourceid, c.targetid
> -> Merge Join (cost=9912.07..12498.77 rows=3467 width=48) (actual
> time=1391.000..1407.000 rows=3467 loops=1)
> Merge Cond: ("outer".user_id = "inner".sourceid)
> -> Index Scan using lte_user_pkey on lte_user s
> (cost=0.00..12807.34 rows=279395 width=16) (actual time=0.000..46.000
> rows=50034 loops=1)
> -> Sort (cost=9912.07..9920.73 rows=3467 width=40) (actual
> time=1188.000..1188.000 rows=3467 loops=1)
> Sort Key: c.sourceid
> -> Hash Join (cost=8710.44..9708.21 rows=3467 width=40)
> (actual time=1157.000..1188.000 rows=3467 loops=1)
> Hash Cond: ("outer".targetid = "inner".user_id)
> -> Seq Scan on candidates617004 c
> (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..15.000
> rows=3467 loops=1)
> -> Hash (cost=8011.95..8011.95 rows=279395
> width=16) (actual time=1157.000..1157.000 rows=0 loops=1)
> -> Seq Scan on lte_user t
> (cost=0.00..8011.95 rows=279395 width=16) (actual time=0.000..750.000
> rows=279395 loops=1)
> Total runtime: 1422.000 ms
>
> random_page_cost set to 1.5 and enable_hashjoin set to false
> --------------------------------------------------------------------------------------------------
> QUERY PLAN
> Sort (cost=13565.58..13574.25 rows=3467 width=48) (actual
> time=390.000..390.000 rows=3467 loops=1)
> Sort Key: c.sourceid, c.targetid
> -> Nested Loop (cost=271.52..13361.73 rows=3467 width=48) (actual
> time=203.000..360.000 rows=3467 loops=1)
> -> Merge Join (cost=271.52..2762.88 rows=3467 width=40)
> (actual time=203.000..250.000 rows=3467 loops=1)
> Merge Cond: ("outer".user_id = "inner".sourceid)
> -> Index Scan using lte_user_pkey on lte_user s
> (cost=0.00..12807.34 rows=279395 width=16) (actual time=0.000..48.000
> rows=50034 loops=1)
> -> Sort (cost=271.52..280.19 rows=3467 width=32) (actual
> time=15.000..31.000 rows=3467 loops=1)
> Sort Key: c.sourceid
> -> Seq Scan on candidates617004 c
> (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..15.000
> rows=3467 loops=1)
> -> Index Scan using lte_user_pkey on lte_user t
> (cost=0.00..3.02 rows=1 width=16) (actual time=0.023..0.023 rows=1
> loops=3467)
> Index Cond: ("outer".targetid = t.user_id)
> Total runtime: 406.000 ms
>
> Thanks,
> Meetesh

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Meetesh Karia 2005-08-01 23:30:26 Re: Planner incorrectly choosing seq scan over index scan
Previous Message Tobias Brox 2005-08-01 22:37:08 Re: Planner incorrectly choosing seq scan over index scan