Re: Improve Query

From: Zach Calvert <zachcalvert(at)hemerasoftware(dot)com>
To: Nikolas Everett <nik9000(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, Dan Scott <balancedtree(at)gmail(dot)com>
Subject: Re: Improve Query
Date: 2009-05-27 16:25:18
Message-ID: 89af81ab0905270925r23e381e5tc6395c9d215f6ae4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I'm running the inserts now via a JDBC call I have, which is then
followed up by the query I'm showing and a few others. I have run
tests on all of the others, and all others run index scans and are
very fast, 10 ms or less. This one started at 2 milliseconds when the
table is empty and is up to 40 milliseconds with 40K inserts. It is
degrading fast and I can't imagine what will happen with 400K, let
alone 400 million.

It is getting slower at a fairly fast clip and I need it to remain
fast. Does postgre just not do count(*) with index scans? Is that my
problem?

I'm still running the exact same query. Here are the indexes I have tried
CREATE INDEX idx_score_score
ON score
USING btree
(score);

CREATE INDEX idx_score_ldbscore
ON score
USING btree
(leaderboardid, score);

CREATE INDEX idx_score_ldbactive
ON score
USING btree
(leaderboardid, active);

CREATE INDEX idx_score_ldbactivescore
ON score
USING btree
(leaderboardid, active, score);

CREATE INDEX idx_score_scoreactiveldb
ON score
USING btree
(score, active, leaderboardid);

Yet still I run
explain analyze
select count(*)
from score
where leaderboardid=35 and active and score <= 6841

and get
"Aggregate (cost=2641.29..2641.30 rows=1 width=0) (actual
time=134.826..134.826 rows=1 loops=1)"
" -> Seq Scan on score (cost=0.00..2536.44 rows=41938 width=0)
(actual time=0.011..126.250 rows=40918 loops=1)"
" Filter: (active AND (score <= 6841) AND (leaderboardid = 35))"
"Total runtime: 48.891 ms"

On Wed, May 27, 2009 at 11:06 AM, Nikolas Everett <nik9000(at)gmail(dot)com> wrote:
> The plan ought to be different when there are more scores and the table is
> analyzed and your statistics target is high enough.  At this point you don't
> have enough data to merit doing anything but a seq scan.  The overhead is
> simply not worth it.
>
> You could try inserting a lot more rows.  I'd create a function to do
> several million inserts with random numbers and then analyze and rerun.
>
> I think in the end your probably going to see a couple of bitmap index
> scans, anding the results together, and then a bitmap scan.
>
> Keep in mind that postgres stores statistics information about each column,
> but doesn't at this point store statistics about two columns together.
>
> Preparing the query might actually hurt performance because postgres treats
> a prepare as "plan this query but I'm not going to tell you value of the
> parameters".  If you actually let the query replan every time then you will
> get a different plan for the leaderboards or score ranges that are more
> popular.
>
> On Wed, May 27, 2009 at 8:09 AM, Zach Calvert
> <zachcalvert(at)hemerasoftware(dot)com> wrote:
>>
>> So Google hasn't been helpful and I'm not entirely sure what  to look
>> for in the mailing lists to find the answer to my problem, so here
>> goes.
>>
>> I have a query and I have run
>> explain analyze
>> select count(*)
>> from score
>> where leaderboardid=35 and score <= 6841 and active
>>
>> The result is
>> "Aggregate  (cost=2491.06..2491.07 rows=1 width=0) (actual
>> time=38.878..38.878 rows=1 loops=1)"
>> "  ->  Seq Scan on score  (cost=0.00..2391.17 rows=39954 width=0)
>> (actual time=0.012..30.760 rows=38571 loops=1)"
>> "        Filter: (active AND (score <= 6841) AND (leaderboardid = 35))"
>> "Total runtime: 38.937 ms"
>>
>> I have an index on score, I have an index on score and leaderboard and
>> active.  I can't seem to figure out how to create an index that will
>> turn that "Seq Scan" into an index scan. The biggest problem is that
>> the query degrades very quickly with a lot more rows and I will be
>> getting A LOT MORE rows.  What can I do to improve the performance of
>> this query?
>>
>>
>>
>>
>>
>> Thank you so much,
>> ZC
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Grzegorz Jaśkiewicz 2009-05-27 16:32:50 Re: Improve Query
Previous Message Nikolas Everett 2009-05-27 16:06:38 Re: Improve Query