Re: Query Optimization

From: Zach Calvert <zachcalvert(at)hemerasoftware(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>, Luiz Eduardo Cantanhede Neri <lecneri(at)gmail(dot)com>, pgsql-novice(at)postgresql(dot)org, Dan Scott <balancedtree(at)gmail(dot)com>
Subject: Re: Query Optimization
Date: 2009-05-27 16:02:46
Message-ID: 89af81ab0905270902r29c8e2s19f07a9626cb4cc7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

I've added the leaderboardid, score index and it still does the
sequential scan. I'm going to try inserting a few hundred thousand
rows and seeing if it switches to index scan.

On Wed, May 27, 2009 at 10:34 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Sean Davis <sdavis2(at)mail(dot)nih(dot)gov> writes:
>>> zachcalvert(at)hemerasoftware(dot)com> wrote:
>>>> I have a query and I have run
>>>> explain analyze
>>>> select count(*)
>>>> from score
>>>> where leaderboardid=35 and score <= 6841 and active
>>>>
>>>> The result is
>>>> "Aggregate  (cost=2491.06..2491.07 rows=1 width=0) (actual
>>>> time=38.878..38.878 rows=1 loops=1)"
>>>> "  ->  Seq Scan on score  (cost=0.00..2391.17 rows=39954 width=0)
>>>> (actual time=0.012..30.760 rows=38571 loops=1)"
>>>> "        Filter: (active AND (score <= 6841) AND (leaderboardid = 35))"
>>>> "Total runtime: 38.937 ms"
>>>>
>>>> I have an index on score, I have an index on score, leaderboardid, and
>>>> active and still it does a sequential scan.
>
>> Postgresql is aware of the "cost" associated with each query.  In the case
>> of a small table with an index that is not very discriminative, it may
>> choose a sequential scan.  However, as you add more rows, the index scan may
>> become more effective and may be used instead.  One thing to keep in mind is
>> that an index scan is NOT always faster than a sequential scan.
>
> A crude rule of thumb is that you need the query to fetch less than
> ten percent of the rows before a bitmap scan is going to be a win,
> and less than one percent before a plain indexscan is going to be a win.
> (If your database is entirely cached in memory then the crossover
> percentages are higher, and you need to adjust the planner's cost
> parameters so that it gets this right.)  It's not clear exactly how
> big this table is, but I'm betting the query is fetching more than
> ten percent of it.
>
> One point worth making is that if this is the typical set of conditions
> in your queries, then the best index would be one on (leaderboardid,
> score) not (score, leaderboardid, active).  (I'm betting that the
> condition active = true is so nonselective it's not worth keeping it in
> the index at all.)  You want equality conditions on the leading
> column(s) and inequalities on the trailing columns.  To see why this is,
> think about the index sort ordering and the portion of the index that
> the query will have to scan.  In the latter case the set of index
> entries matching this query is a contiguous group; in the former, not.
>
> Our fine manual has a reasonable amount of detail about proper index
> design:
> http://www.postgresql.org/docs/8.3/static/indexes.html
>
>                        regards, tom lane
>

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Robert Schnabel 2009-05-27 19:03:33 table check constraint
Previous Message Thomas Kellerer 2009-05-27 16:00:39 Re: Tool for modeling