Re: Optimizer Selecting Incorrect Index

From: Richard Huxton <dev(at)archonet(dot)com>
To: David Price <dprice(at)dentfirst(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Optimizer Selecting Incorrect Index
Date: 2004-08-25 15:47:19
Message-ID: 412CB487.6060704@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

David Price wrote:
> I have 2 servers both with the exact same data, the same O.S., the same
> version of Postgres (7.4.5) and the exact same db schema's (one production
> server, one development server). One server is using the correct index for
> SQL queries resulting in extremely slow performance, the other server is
> properly selecting the index to use and performance is many times better. I
> have tried vacuum, but that did not work. I finally resorted to dumping the
> data, removing the database completely, creating a new database and
> importing the data only to have to problem resurface. The table has
> 5,000,000+ rows on both the systems.
>
> When I run 'analyze verbose' on the correctly working system, the following
> is displayed:

EXPLAIN ANALYZE is usually considered enough

> Index Scan using trptserc on trans (cost=0.00..465.10 rows=44 width=118)
> Index Cond: (trn_patno = 199999)
> Filter: ((trn_old_date >= '1994-08-23'::date) AND (trn_old_date <=
> '2004-08-23'::date) AND (trn_bill_inc = 'B'::bpchar))
> (687 rows)
>
>
> Now, when I run 'analyze verbose' on the INCORRECTLY working system, the
> following is displayed:

> Index Scan using todate on trans (cost=0.00..105165.74 rows=1 width=143)
> Index Cond: ((trn_old_date >= '1994-08-23'::date) AND (trn_old_date <=
> '2004-08-23'::date))
> Filter: ((trn_patno = 199999) AND (trn_bill_inc = 'B'::bpchar))
> (713 rows)

These queries are different. The first returns 687 rows and the second
713 rows. You need to check your systems if they are supposed to be
identical.

Things to check:
1. postgresql.conf settings match - different costs could cause this
2. statistics on the two columns (trn_patno,trn_old_date) - if they
differ considerably between systems that would also explain it.

I suspect the second one, at a wild guess the working system happens to
know 199999 is fairly rare wheras the second just estimates an average.

If the stats don't help, people are going to want to see the entire
query+plan. Could you repost with the query and explain analyse on both
system. Oh, and some idea on how many rows/unique values are involved in
the important columns.

--
Richard Huxton
Archonet Ltd

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2004-08-25 16:59:15 Re: What is the best way to do attribute/values?
Previous Message David Price 2004-08-25 15:07:27 Optimizer Selecting Incorrect Index