Re: optimizer choosing the wrong index

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Martin Below <machtin(dot)below(at)googlemail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: optimizer choosing the wrong index
Date: 2010-07-07 13:21:32
Message-ID: AANLkTilH8gU5-YD2OJhtLyaBUlePltAl7md8niv8PZA6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Jul 7, 2010 at 7:42 AM, Martin Below
<machtin(dot)below(at)googlemail(dot)com> wrote:
> Hello,
>
> I'm facing a strange problem where the optimizer does pick the wrong index.
> Im using postgres 8.4, and my schema look like this:
>
>  client_id   | character varying(36)       | not null
>  key         | character varying(16)       | not null
>  expires_on  | timestamp without time zone | not null
>
> Indexe:
>    "ps_pkey" PRIMARY KEY, btree (client_id, key)
>    "idx_correct" btree (client_id, expires_on)
>    "idx_wrong" btree (expires_on)
>
>
> the query:
> explain analyze select * from ps where client_id='foo' and expires_on
> = timestamp '2010-11-24';
>
>                                                  QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------
>  Index Scan using idx_wrong on ps  (cost=0.00..8.29 rows=1 width=61)
> (actual time=0.010..0.010 rows=0 loops=1)
>   Index Cond: (expires_on = '2010-11-24 00:00:00'::timestamp without time zone)
>   Filter: ((client_id)::text = 'foo'::text)
>  Total runtime: 0.089 ms
>
>
> Why is "idx_wrong" used (which only includes one of the fields
> queried) instead of idx_correct (which contains both fields)?
> If I drop idx_wrong, the correct index is choosen:
>
> test=# explain analyze select * from ps where client_id='foo' and
> expires_on = timestamp '2010-11-24';
>
>                                                       QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------
>  Index Scan using idx_correct on ps  (cost=0.00..8.34 rows=1 width=53)
> (actual time=0.023..0.023 rows=0 loops=1)
>   Index Cond: (((client_id)::text = 'foo'::text) AND (expires_on =
> '2010-11-24 00:00:00'::timestamp without time zone))
>  Total runtime: 0.058 ms
>
>
>
> The problem seems to me that the estimates costs are not correct.
> With the table containing about 200.000 records, using the "wrong"
> index takes about 22 times as long as using the "right" index. I did
> run "vacuum analyze", without any effect.
>
> Any help would be very much appreciated.

can you supply the plans on the actual tables? the 'wrong' index
might actually be the 'right' one if expires_on is of high cardinality
(perhaps it's distributed badly and the table needs a stats tweak to
make it correct).

btw, consider using 'date' type for dates vs non timezone timestamp,
which is a bit of a kludge imo.

You can probably force the right index like this:
explain analyze select * from ps where (client_id, expires_on) =
('foo', '2010-11-24'::timestamp);

merlin

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Merlin Moncure 2010-07-07 13:27:49 Re: make view with union return one record
Previous Message Pavel Stehule 2010-07-07 13:20:24 Re: TupleDesc and HeapTuple