(Questioning the planner's mind) - was Re: Fastest way to drop an index?

From: Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: (Questioning the planner's mind) - was Re: Fastest way to drop an index?
Date: 2009-02-04 04:23:59
Message-ID: e373d31e0902032023v76498bd8yb00ab0a1c8caf443@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Feb 4, 2009 at 5:13 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> writes:
>> Index Scan using new_idx_testimonials_userid on testimonials
>> (cost=0.00..157.78 rows=1 width=9) (actual time=8809.715..8809.715
>> rows=0 loops=1)
>> Index Cond: ((user_id)::text = 'superman'::text)
>> Filter: ((title_encrypted)::text = 'b333dc1b0992cb8c70b58a418211389a'::text)
>> Total runtime: 8809.750 ms
>
> This is using the index to fetch the rows that match user_id = 'superman',
> and then testing each fetched row to see if it has the desired value of
> title_encrypted. The fact that hardly any rows pass the filter test
> tells nearly nothing about how long this should be expected to run.
> The rather high estimated cost suggests that the planner thinks there
> are several dozen rows matching the index condition, and the actual
> runtime suggests that there are actually hundred or thousands of 'em.
> If so, your problem is that you need a different index. I'd bet on an
> index on title_encrypted being more useful for this query than the one
> on user_id; or you could experiment with a two-column index.
>

Thanks Tom. My thinking exactly.

So I have made a two column index on (user_id, title_encrypted)
already. It's done.

But the planner keeps insisting on using the "user_id" as you see from
that EXPLAIN ANALYZE. This was done when the other two-col index
already exists!

Could I force the 2-col index? I googled for "force index postgresql"
and came upon this discussion -
http://archives.postgresql.org/pgsql-sql/2006-02/msg00190.php - which
suggests that the planner may be selecting indexes based on "cost'.

I am not too technically savvy, but I think this means that given the
choice of these two scenarios...

1. Search through "user_id" index, and then limit it by "title_encrypted"
or
2. Search through "user_id, title_encrypted" 2-col index

...the planner decides that it is less resource intensive to go
through the somewhat smaller user_id index and then limit it (i.e.,
scenario 1) than to wade through the bigger second index.

Am I on the right track? If I am, well what's the way around this? How
can I make the planner make use of the 2-col index? Or if my
understanding is not right, why is the scenario 1 being chosen to
begin with?

Thanks for any thoughts! This single query, which used to be much
faster than this, is now slowing down our operations by about 8 second
per query!

Let me know if you need to know any pgsql.conf settings. Only index
related setting I know of are these:

enable_indexscan = on
enable_bitmapscan = off
enable_nestloop = on

Thanks

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Guy Rouillier 2009-02-04 04:55:25 Re: Pet Peeves?
Previous Message Craig Ringer 2009-02-04 04:14:11 Re: Pet Peeves?