Re: select query takes 13 seconds to run with index

From: mark <markkicks(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org, david(dot)t(dot)wilson(at)gmail(dot)com, Gregory(dot)Williamson(at)digitalglobe(dot)com, justin(at)emproshunts(dot)com
Subject: Re: select query takes 13 seconds to run with index
Date: 2008-05-27 03:12:26
Message-ID: 82fa9e310805262012p216dc4acv6fa483ed118b1e9e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

On Mon, May 26, 2008 at 7:58 PM, mark <markkicks(at)gmail(dot)com> wrote:
> On Mon, May 26, 2008 at 5:36 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> mark <markkicks(at)gmail(dot)com> writes:
>>> EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
>>> DESC limit 6;
>>> QUERY PLAN
>>> ----------------------------------------------------------------------------------------------------------------------------------------------------
>>> Limit (cost=0.00..9329.02 rows=6 width=135) (actual
>>> time=13612.247..13612.247 rows=0 loops=1)
>>> -> Index Scan Backward using pokes_pkey on pokes
>>> (cost=0.00..5182270.69 rows=3333 width=135) (actual
>>> time=13612.245..13612.245 rows=0 loops=1)
>>> Filter: (uid = 578439028)
>>> Total runtime: 13612.369 ms
>>> (4 rows)
>>
>> The problem is the vast disconnect between the estimated and actual
>> rowcounts for the indexscan (3333 vs 0). The planner thinks there
>> are three thousand rows matching uid = 578439028, and that encourages
>> it to try a plan that's only going to be fast if at least six such
>> rows show up fairly soon while scanning the index in reverse id order.
>> What you really want it to do here is scan on the uid index and then
>> sort the result by id ... but that will be slow in exactly the case
>> where this plan is fast, ie, when there are a lot of matching uids.
>>
>> Bottom line: the planner cannot make the right choice between these
>> alternatives unless it's got decent statistics about the frequency
>> of uid values. "I analyzed the table about a week ago" is not good
>> enough maintenance policy --- you need current stats, and you might need
>> to bump up the statistics target to get enough data about less-common
>> values of uid.
> how do i do this? bump up the statistics target?
>
>> (Since it's 8.3, the autovac daemon might have been analyzing for you,
>> if you didn't turn off autovacuum. In that case increasing the
>> statistics target is the first thing to try.)
> i did not turn it off..
> and my OS is fedora 9
>
> i ran vacuum verbose analyze pokes, and then ran the same query, and
> there is no improvement..
>
> EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id limit 6;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=0.00..8446.80 rows=6 width=130) (actual
> time=12262.779..12262.779 rows=0 loops=1)
> -> Index Scan using pokes_pkey on pokes (cost=0.00..5149730.49
> rows=3658 width=130) (actual time=12262.777..12262.777 rows=0 loops=1)
> Filter: (uid = 578439028)
> Total runtime: 12262.817 ms

OK I did this

ALTER TABLE pokes ALTER uid set statistics 500;
ALTER TABLE

ANALYZE pokes;
ANALYZE

and then it became super fast!! thanks a lot!!!
my question:
-> is 500 too high? what all does this affect?
-> now increasing this number does it affect only when i am running
analyze commands, or will it slow down inserts and other operations?
EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
desc limit 6;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=467.80..467.81 rows=6 width=134) (actual
time=0.016..0.016 rows=0 loops=1)
-> Sort (cost=467.80..468.09 rows=117 width=134) (actual
time=0.016..0.016 rows=0 loops=1)
Sort Key: id
Sort Method: quicksort Memory: 25kB
-> Index Scan using idx_pokes_uid on pokes
(cost=0.00..465.70 rows=117 width=134) (actual time=0.011..0.011
rows=0 loops=1)
Index Cond: (uid = 578439028)
Total runtime: 0.037 ms

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Webb Sprague 2008-05-27 04:11:20 Re: Populating a sparse array piecemeal in plpgsql (REDUX)
Previous Message mark 2008-05-27 02:58:51 Re: select query takes 13 seconds to run with index

Browse pgsql-performance by date

  From Date Subject
Next Message PFC 2008-05-27 08:19:37 Re: [PERFORM] select query takes 13 seconds to run with index
Previous Message mark 2008-05-27 02:58:51 Re: select query takes 13 seconds to run with index