Planner is getting wrong row count

From: AI Rumman <rummandba(at)gmail(dot)com>
To: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Planner is getting wrong row count
Date: 2013-04-02 18:57:45
Message-ID: CAGoODpdTE__dLDhJe2d_4dwZj0YOx1vp7O-Lrwj2pYtxdMeBKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Why is the following query getting wrong estimation of rows?
I am using Postgresql 9.2.1 with default_statistics_target = 100.
I execute vacuum analyze each night.

explain analyze
SELECT
entity.id AS "Leads_id", entity.type AS "Leads_type" ,
leads.firstname AS "Leads_firstname",
leads.lastname AS "Leads_lastname"
FROM leads
INNER JOIN entity ON leads.leadid=entity.id
LEFT JOIN groups ON groups.groupid = entity.smownerid
LEFT join users ON entity.smownerid= users.id
WHERE entity.type='Leads' AND entity.deleted=0 AND leads.converted=0
Hash Join (cost=14067.90..28066.53 rows=90379 width=26) (actual
time=536.009..1772.910 rows=337139 loops=1)
Hash Cond: (leads.leadid = entity.id)
-> Seq Scan on leads (cost=0.00..7764.83 rows=533002 width=18) (actual
time=0.008..429.576 rows=532960 loops=1)
Filter: (converted = 0)
-> Hash (cost=9406.25..9406.25 rows=372932 width=16) (actual
time=535.800..535.800 rows=342369 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 16049kB
-> Index Scan using entity_type_idx on entity
(cost=0.00..9406.25 rows=372932 width=16) (actual time=0.030..305.250
rows=342369 loops=1)
Index Cond: ((type)::text = 'Leads'::text)

\d leads
Table "public.leads"
Column | Type | Modifiers

------------------+------------------------+---------------------------------------
leadid | integer | not null
email | character varying(100) |
interest | character varying(50) |
firstname | character varying(100) |
salutation | character varying(200) |
lastname | character varying(100) | not null
company | character varying(200) | not null
annualrevenue | integer | default 0
industry | character varying(200) |
campaign | character varying(30) |
rating | character varying(200) |
leadstatus | character varying(50) |
leadsource | character varying(200) |
converted | integer | default 0
designation | character varying(200) | default 'SalesMan'::character
varying
licencekeystatus | character varying(50) |
space | character varying(250) |
comments | text |
priority | character varying(50) |
demorequest | character varying(50) |
partnercontact | character varying(50) |
productversion | character varying(20) |
product | character varying(50) |
maildate | date |
nextstepdate | date |
fundingsituation | character varying(50) |
purpose | character varying(50) |
evaluationstatus | character varying(50) |
transferdate | date |
revenuetype | character varying(50) |
noofemployees | integer |
yahooid | character varying(100) |
assignleadchk | integer | default 0
department | character varying(200) |
emailoptout | character varying(3) | default 0
siccode | character varying(50) |
Indexes:
"leads_pkey" PRIMARY KEY, btree (leadid)
"ftx_en_leads_company" gin (to_tsvector('v_en'::regconfig,
for_fts(company::text)))
"ftx_en_leads_email" gin (to_tsvector('v_en'::regconfig,
for_fts(email::text)))
"ftx_en_leads_emailoptout" gin (to_tsvector('v_en'::regconfig,
for_fts(emailoptout::text)))
"ftx_en_leads_firstname" gin (to_tsvector('v_en'::regconfig,
for_fts(firstname::text)))
"ftx_en_leads_lastname" gin (to_tsvector('v_en'::regconfig,
for_fts(lastname::text)))
"ftx_en_leads_yahooid" gin (to_tsvector('v_en'::regconfig,
for_fts(yahooid::text)))
"leads_converted_idx" btree (converted)
"leads_leadsource_idx" btree (leadsource)
"leads_leadstatus_idx" btree (leadstatus)

\d entity
Table "public.entity"
Column | Type | Modifiers

--------------------+-----------------------------+------------------------------
id | integer | not null
smcreatorid | integer | not null default 0
smownerid | integer | not null default 0
modifiedby | integer | not null default 0
setype | character varying(30) | not null
description | text |
createdtime | timestamp without time zone | not null
modifiedtime | timestamp without time zone | not null
viewedtime | timestamp without time zone |
status | character varying(50) |
version | integer | not null default 0
presence | integer | default 1
deleted | integer | not null default 0
owner_type | character(1) | not null default
'U'::bpchar
last_activity_date | timestamp without time zone |
Indexes:
"entity_pkey" PRIMARY KEY, btree (id)
"entity_createdtime_idx" btree (createdtime)
"entity_modifiedby_idx" btree (modifiedby)
"entity_modifiedtime_idx" btree (modifiedtime)
"entity_setype_idx" btree (setype) WHERE deleted = 0
"entity_smcreatorid_idx" btree (smcreatorid)
"entity_smownerid_idx" btree (smownerid)
"ftx_en_entity_description" gin (to_tsvector('v_en'::regconfig,
for_fts(description)))
"entity_deleted_idx" btree (deleted)
Referenced by:
TABLE "service" CONSTRAINT "fk_1_service" FOREIGN KEY (serviceid)
REFERENCES entity(id) ON DELETE CASCADE
TABLE "servicecontracts" CONSTRAINT "fk_1_servicecontracts" FOREIGN KEY
(servicecontractsid) REFERENCES entity(id) ON DELETE CASCADE
TABLE "_emails" CONSTRAINT "fk__emails_id" FOREIGN KEY (id) REFERENCES
entity(id) ON DELETE CASCADE

Please advice.

Thanks.

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Dieter Rehbein 2013-04-03 08:18:55 Re: Join between 2 tables always executes a sequential scan on the larger table
Previous Message Armand du Plessis 2013-04-02 18:16:03 Re: Problems with pg_locks explosion