Skip site navigation (1) Skip section navigation (2)

BUG #5885: Strange rows estimation for left join

From: "Maxim Boguk" <Maxim(dot)Boguk(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #5885: Strange rows estimation for left join
Date: 2011-02-15 02:43:17
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-bugs
The following bug has been logged online:

Bug reference:      5885
Logged by:          Maxim Boguk
Email address:      Maxim(dot)Boguk(at)gmail(dot)com
PostgreSQL version: 8.4.4
Operating system:   Linux
Description:        Strange rows estimation for left join

I found that strange effect while helping with slow query on russian
postgresql online forum.

To be short results looks like:

explain analyze
select *
from references rs
left join references vm on vm.reference = rs.reference and vm.attr_id =
where rs.object_id = 9129863193713091717;

"Nested Loop Left Join  (cost=0.00..1567.71 rows=2129718 width=112) (actual
time=14.654..14.672 rows=2 loops=1)"
"  ->  Index Scan using xif02references on references rs  (cost=0.00..19.85
rows=11 width=56) (actual time=0.019..0.020 rows=2 loops=1)"
"        Index Cond: (object_id = 9129863193713091717::numeric)"
"  ->  Index Scan using xif01references on references vm  (cost=0.00..140.03
rows=55 width=56) (actual time=7.321..7.321 rows=0 loops=2)"
"        Index Cond: ((vm.reference = rs.reference) AND (vm.attr_id =
"Total runtime: 14.758 ms"

E.g. 6 orders of magnitude error in result rows selectivity (2129718 vs 2).

Table description:

CREATE TABLE references
  attr_id numeric(20,0) NOT NULL,
  reference numeric(20,0) NOT NULL,
  object_id numeric(20,0) NOT NULL

CREATE INDEX xif01references
  ON references
  USING btree
  (reference, attr_id);

CREATE INDEX xif02references
  ON references
  USING btree
  (object_id, attr_id, reference);

Yes again unfortunate EAV model.

Related data from pg_stats

"schemaname"       	"public"
"tablename"        	"references"

"attname"          	"attr_id"
"null_frac"        	0
"avg_width"        	10
"n_distinct"       	23
"most_common_freqs"	"{0.234067;0.233433;0.227433;0.0998333;0.0476667;0.0449;

d"attname"          	"reference"
"null_frac"        	0
"avg_width"        	14
"n_distinct"       	56252
"most_common_freqs"	"{0.2497;0.0138333;0.0002;0.000166667;0.000166667;...}"

"attname"          	"object_id"
"null_frac"        	0
"avg_width"        	15
"n_distinct"       	1.23744e+06
"most_common_freqs"	"{0.0003;0.000266667;0.000266667;0.000233333;0.0002;0.00

What seems very strange for me is final esctimation 2M when estimated 11
rows on right side of the join and 55 rows on left side of the join (so
final rows must be 55*11 ~ 600 rows even with worst case of left join

May be that effect already fixed in 8.4.5+ (but I had readed changes list
since 8.4.4 and I don't find any related changes).

PS: Sorry if I completely missunderstood of planner mechanics in left join
rows estimation.


pgsql-bugs by date

Next:From: Tom LaneDate: 2011-02-15 03:27:31
Subject: Re: BUG #5885: Strange rows estimation for left join
Previous:From: Pavel StehuleDate: 2011-02-14 17:43:47
Subject: Re: BUG #5884: i cant select entire array values

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group