Quick Links

Re: How to get the optimizer to use an index with multiple

From:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To:	"Jim C(dot) Nasby" <jim(at)nasby(dot)net>
Cc:	<pgsql-performance(at)postgresql(dot)org>
Subject:	Re: How to get the optimizer to use an index with multiple
Date:	2003-04-28 19:52:57
Message-ID:	Pine.LNX.4.33.0304281347460.13935-100000@css120.ihs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On Mon, 28 Apr 2003, Jim C. Nasby wrote:

> select id into temp NewRetires from stats_participant where retire_to>=1
> AND retire_date = (SELECT last_date FROM Project_statsrun WHERE
> project_id = :ProjectID);
>
> results in a table with 5 values...
>
> explain analyze delete from email_rank where project_id=25 and id in
> (select id from NewRetires);
>
> Index Scan using email_rank__day_rank on email_rank
> (cost=0.00..9003741627715.16 rows=45019 width=6) (actual time=408.12..9688.37 rows=3 loops=1)
> Index Cond: (project_id = 25)
> Filter: (subplan)
> SubPlan
> -> Seq Scan on newretires (cost=100000000.00..100000020.00 rows=1000 width=4) (actual time=0.01..0.05 rows=5 loops=91834)
> Total runtime: 9689.86 msec
>
> But, there's already an index that would fit the bill here perfectly:
>
> Table "public.email_rank"
> Column | Type | Modifiers
> -----------------------+---------+--------------------
> project_id | integer | not null
> id | integer | not null
> first_date | date | not null
> last_date | date | not null
> day_rank | integer | not null default 0
> day_rank_previous | integer | not null default 0
> overall_rank | integer | not null default 0
> overall_rank_previous | integer | not null default 0
> work_today | bigint | not null default 0
> work_total | bigint | not null default 0
> Indexes: email_rank_pkey primary key btree (project_id, id),
> email_rank__day_rank btree (project_id, day_rank),
> email_rank__overall_rank btree (project_id, overall_rank)
>
> Why isn't it using email_rank_pkey instead of using day_rank then a
> filter? The original query on sybase (see below) is essentially instant,
> because it's using the index of (project_id, id), so it doesn't have to
> read the whole table.

It looks like the seq scan is newretires up there, from your 'id in
(select id from NewRetires);' part of your query. I.e. the where in() has
to be done first, and the query planner has no stats on that table, so it
assumes a seq scan will be faster in case we need the whole thing anyway.

Try adding an analyze newretires in there between the two queries.

No clue as to why it's choosing one index over the other. I don't think
that really matters a lot, it's the seq scan on the temp table that is
taking your time on this.

In response to

How to get the optimizer to use an index with multiple fields at 2003-04-28 18:21:43 from Jim C. Nasby

Responses

Re: How to get the optimizer to use an index with multiple at 2003-04-28 22:54:39 from Jim C. Nasby

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Jim C. Nasby	2003-04-28 22:54:39	Re: How to get the optimizer to use an index with multiple
Previous Message	Jim C. Nasby	2003-04-28 18:32:44	Re: How to get the optimizer to use an index with multiple fields