Skip site navigation (1) Skip section navigation (2)

Re: Inconsistant use of index.

From: "Michael G(dot) Martin" <michael(at)vpmonline(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ron Mayer <ron(at)intervideo(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Inconsistant use of index.
Date: 2002-03-26 18:10:10
Message-ID: 3CA0B982.2050503@vpmonline.com (view raw or flat)
Thread:
Lists: pgsql-bugs
I had an issue where my index was not always used on a very large table. 
 The issue came down to the data distribution and not pulling in enough 
of a random sample to get an accurate estimate ( I think the default max 
value was around 3000 sample rows ( 300 * 10 default_samples -- see 
analyze.c ) rows.  I fixed the issue by following Tom's advice and 
increased the statistics count on my table to pull in 300000 rows (1000 
samples *300).  I had to play with the value, re-analyze, and check the 
stats in the pg_stats table until most_common_freqs on some values were 
all fairily close.  The explain plan still shows me a cost and row value 
way above what is physically in the table, but at least my indexes were 
being used.

alter table table_name alter symbol_name set statistics 1000;

--Michael


Tom Lane wrote:

>Ron Mayer <ron(at)intervideo(dot)com> writes:
>
>>  Once some of my tables started getting pretty large, PostgreSQL
>>suddenly stopped using indexes when I use expressions like "col = value"
>>decreasing performance by 20X.
>>
>
>Hmm.  The EXPLAIN shows that the planner is not doing too badly at
>estimating the number of rows involved:
>
>>logs2=# explain analyze select count(*) from fact where dat='2002-03-01';
>>NOTICE:  QUERY PLAN:
>>
>
>>Aggregate  (cost=375631.14..375631.14 rows=1 width=0) (actual
>>time=76689.42..76689.42 rows=1 loops=1)
>>  ->  Seq Scan on fact  (cost=0.00..375101.72 rows=211765 width=0) (actual
>>time=20330.96..76391.94 rows=180295 loops=1)
>>Total runtime: 76707.92 msec
>>
>
>212K estimate for 180K real is not bad at all.  So the problem is in the
>cost models not the initial row count estimation.
>
>If you force an indexscan via "set enable_seqscan to off", what does
>EXPLAIN ANALYZE report?
>
>Also, what do you get from
>	select * from pg_stats where tablename = 'fact';
>I'm particularly interested in the correlation estimate for the dat
>column.  (Would you happen to have an idea whether the data has been
>inserted more-or-less in dat order?)
>
>			regards, tom lane
>
>---------------------------(end of broadcast)---------------------------
>TIP 3: if posting/reading through Usenet, please send an appropriate
>subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
>message can get through to the mailing list cleanly
>

In response to

pgsql-bugs by date

Next:From: Ron MayerDate: 2002-03-26 18:11:43
Subject: Re: Inconsistant use of index.
Previous:From: Michael BeckstetteDate: 2002-03-26 18:03:50
Subject: Re: postmaster crash

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group