estimated rows vs. actual rows

From: Jaime Casanova <systemguards(at)gmail(dot)com>
To: performance pgsql <pgsql-performance(at)postgresql(dot)org>
Subject: estimated rows vs. actual rows
Date: 2005-02-13 21:27:45
Message-ID: c2d9e70e0502131327c95e63d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

in the #postgresql-es channel someone shows me this:

pgsql-7.4.5 + postgis

--- begin context ---

CREATE TABLE calles (
gid int4 NOT NULL DEFAULT nextval('public.callesstgo_gid_seq'::text),
nombre varchar,
inicio int4,
termino int4,
comuna varchar,
ciudad varchar,
region numeric,
pais varchar,
the_geom geometry,
id_comuna numeric,
CONSTRAINT callesstgo_pkey PRIMARY KEY (gid),
CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) =
'MULTILINESTRING'::text OR the_geom IS NULL),
CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = -1)
)
WITH OIDS;
 
CREATE INDEX idx_region_comunas ON calles USING btree
(id_comuna, region);

select count(*) from calles;
143902

--- end context ---
 
Ok . here is the problem (BTW, the database has been analyzed just
before this query was execured)

explain analyze
select * from calles where id_comuna = 92 and region=13;

QUERY PLAN Seq Scan on calles (cost=0.00..7876.53 rows=2610
width=279) (actual time=182.590..454.195 rows=4612 loops=1)
Filter: ((id_comuna = 92::numeric) AND (region = 13::numeric))
Total runtime: 456.876 ms

Why is this query using a seq scan rather than a index scan? i notice
the diff between the estimated rows and actual rows (almost 2000).

Can this affect the query plan? i think this is a problem of
statistics, am i right? if so, what can be done?

regards,
Jaime Casanova

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2005-02-13 21:41:09 Re: estimated rows vs. actual rows
Previous Message Patrick Meylemans 2005-02-13 19:24:22 Re: Benchmark