Re: How to get the optimizer to use an index with multiple

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: How to get the optimizer to use an index with multiple
Date: 2003-04-28 19:52:57
Message-ID: Pine.LNX.4.33.0304281347460.13935-100000@css120.ihs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, 28 Apr 2003, Jim C. Nasby wrote:

> select id into temp NewRetires from stats_participant where retire_to>=1
> AND retire_date = (SELECT last_date FROM Project_statsrun WHERE
> project_id = :ProjectID);
>
> results in a table with 5 values...
>
> explain analyze delete from email_rank where project_id=25 and id in
> (select id from NewRetires);
>
> Index Scan using email_rank__day_rank on email_rank
> (cost=0.00..9003741627715.16 rows=45019 width=6) (actual time=408.12..9688.37 rows=3 loops=1)
> Index Cond: (project_id = 25)
> Filter: (subplan)
> SubPlan
> -> Seq Scan on newretires (cost=100000000.00..100000020.00 rows=1000 width=4) (actual time=0.01..0.05 rows=5 loops=91834)
> Total runtime: 9689.86 msec
>
> But, there's already an index that would fit the bill here perfectly:
>
> Table "public.email_rank"
> Column | Type | Modifiers
> -----------------------+---------+--------------------
> project_id | integer | not null
> id | integer | not null
> first_date | date | not null
> last_date | date | not null
> day_rank | integer | not null default 0
> day_rank_previous | integer | not null default 0
> overall_rank | integer | not null default 0
> overall_rank_previous | integer | not null default 0
> work_today | bigint | not null default 0
> work_total | bigint | not null default 0
> Indexes: email_rank_pkey primary key btree (project_id, id),
> email_rank__day_rank btree (project_id, day_rank),
> email_rank__overall_rank btree (project_id, overall_rank)
>
> Why isn't it using email_rank_pkey instead of using day_rank then a
> filter? The original query on sybase (see below) is essentially instant,
> because it's using the index of (project_id, id), so it doesn't have to
> read the whole table.

It looks like the seq scan is newretires up there, from your 'id in
(select id from NewRetires);' part of your query. I.e. the where in() has
to be done first, and the query planner has no stats on that table, so it
assumes a seq scan will be faster in case we need the whole thing anyway.

Try adding an analyze newretires in there between the two queries.

No clue as to why it's choosing one index over the other. I don't think
that really matters a lot, it's the seq scan on the temp table that is
taking your time on this.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jim C. Nasby 2003-04-28 22:54:39 Re: How to get the optimizer to use an index with multiple
Previous Message Jim C. Nasby 2003-04-28 18:32:44 Re: How to get the optimizer to use an index with multiple fields