Skip site navigation (1) Skip section navigation (2)

Re: Inconsistant use of index.

From: Ron Mayer <ron(at)intervideo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Inconsistant use of index.
Date: 2002-03-26 18:11:43
Message-ID: Pine.LNX.4.33.0203260944070.16667-200000@ron (view raw or flat)
Thread:
Lists: pgsql-bugs
On Tue, 26 Mar 2002, Tom Lane wrote:
>
> Ron Mayer <ron(at)intervideo(dot)com> writes:
> > [...] pretty large, PostgreSQL suddenly stopped using indexes [...]
> [...]
>
> 212K estimate for 180K real is not bad at all.  So the problem is in the
> cost models not the initial row count estimation.
>
> If you force an indexscan via "set enable_seqscan to off", what does
> EXPLAIN ANALYZE report?

It then uses the index:

===================================================================
== logs2=# set enable_seqscan to off;
== SET VARIABLE
== logs2=# explain analyze select count(*) from fact where dat='2002-03-01';
==
== NOTICE:  QUERY PLAN:
==
== Aggregate  (cost=840488.03..840488.03 rows=1 width=0) (actual
== time=2753.82..2753.82 rows=1 loops=1)
==   ->  Index Scan using i_fact__dat on fact  (cost=0.00..839957.59 rows=212174
== width=0) (actual time=101.25..2434.00 rows=180295 loops=1)
== Total runtime: 2754.24 msec
===================================================================


> Also, what do you get from
> 	select * from pg_stats where tablename = 'fact';
> I'm particularly interested in the correlation estimate for the dat
> column.  (Would you happen to have an idea whether the data has been
> inserted more-or-less in dat order?)

I've attached that output as an attachment.

I beleve much of February was loaded first, then we back-filled January,
and daily I've been adding March's results.  I don't believe the index-usage
stopped when we did the january fill... something happend a few days ago after
a pretty routine daily load.




Oh... one more interesting thing...

There are a couple big exceptions to the even distribution of data.

Almost every day has between 190000 and 270000 records except '2002-03-08'
which has 404293 records and '2002-03-25' which has 6 records.

For that particular day, the "<= ... >=" trick doesn't work either.

===================================================================
==logs2=# explain select count(*) from fact where dat<='2002-03-08' and
==dat>='2002-03-08';
==NOTICE:  QUERY PLAN:
==
==Aggregate  (cost=422125.92..422125.92 rows=1 width=0)
==  ->  Seq Scan on fact  (cost=0.00..421128.67 rows=398900 width=0)
==
==EXPLAIN
==logs2=#
==logs2=# explain select count(*) from fact where dat<='2002-03-07' and
==dat>='2002-03-07';
==NOTICE:  QUERY PLAN:
==
==Aggregate  (cost=6.00..6.00 rows=1 width=0)
==  ->  Index Scan using i_fact__dat on fact  (cost=0.00..5.99 rows=1 width=0)
==
==EXPLAIN
===================================================================

I also believe that may have been the day when the index stopped
working for "=" for all dates.

  Ron


Attachment: stats
Description: text/plain (3.3 KB)

In response to

Responses

pgsql-bugs by date

Next:From: Matt PetersonDate: 2002-03-26 19:32:31
Subject: SEGV in contrib/array/array_iterator.c
Previous:From: Michael G. MartinDate: 2002-03-26 18:10:10
Subject: Re: Inconsistant use of index.

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group