Re: Slow query: table iteration (8.3)

From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Glenn Maynard <glenn(at)zewt(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow query: table iteration (8.3)
Date: 2010-02-01 14:54:47
Message-ID: 4B66EB37.9070900@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Glenn Maynard wrote:
> Hitting a performance issues that I'm not sure how to diagnose.
>
> SELECT highscores_for_steps_and_card(s.id, 591, 1) FROM stomp_steps s;
> Seq Scan on stomp_steps s (cost=0.00..793.52 rows=2902 width=4)
> (actual time=26509.919..26509.919 rows=0 loops=1)
> Total runtime: 26509.972 ms
>
Hello Glenn,

Stomp_steps is analyzed to 2902 rows but when you run the query the
actual rows are 0. This means that the highscore function is not called
or the number 0 is incorrect.
Suppose that the number of rows is 2900, then 26 seconds means 100ms per
function call. This still is a lot, compared to the 0.054 ms analyze
result below. The truth might be that you probably got that result by
explaining the query in the function with actual parameter values. This
plan differs from the one that is made when the function is called from
sql and is planned (once) without parameters, and in that case the plan
is probably different. A way to check the plan of that query is to turn
on debug_print_plan and watch the server log. It takes a bit getting
used. The plan starts with CONTEXT: SQL function "functionname" during
startup and is also recognized because in the opexpr (operator
expression) one of the operands is a parameter. Important is the total
cost of the top plan node (the limit).

I know 8.3 is mentioned in the subject, but I think that a WITH query
(http://www.postgresql.org/docs/8.4/interactive/queries-with.html) could
be a good solution to your problem and may be worth trying out, if you
have the possibility to try out 8.4.

Regards,
Yeb Havinga

> The inner function looks like this:
>
> CREATE FUNCTION highscores_for_steps_and_card(steps_id int, card_id
> int, count int) RETURNS SETOF INTEGER LANGUAGE SQL AS $$
> SELECT r.id FROM stomp_round r
> WHERE ($1 IS NULL OR r.steps_id = $1) AND ($2 IS NULL OR
> r.user_card_id = $2)
> ORDER BY r.score DESC LIMIT $3
> $$
>
> Limit (cost=13.12..13.12 rows=1 width=8) (actual time=0.054..0.054
> rows=0 loops=1)
> -> Sort (cost=13.12..13.12 rows=1 width=8) (actual
> time=0.051..0.051 rows=0 loops=1)
> Sort Key: score
> Sort Method: quicksort Memory: 17kB
> -> Bitmap Heap Scan on stomp_round r (cost=9.09..13.11
> rows=1 width=8) (actual time=0.036..0.036 rows=0 loops=1)
> Recheck Cond: ((280 = steps_id) AND (user_card_id = 591))
> -> BitmapAnd (cost=9.09..9.09 rows=1 width=0) (actual
> time=0.032..0.032 rows=0 loops=1)
> -> Bitmap Index Scan on stomp_round_steps_id
> (cost=0.00..4.40 rows=20 width=0) (actual time=0.030..0.030 rows=0
> loops=1)
> Index Cond: (280 = steps_id)
> -> Bitmap Index Scan on stomp_round_user_card_id
> (cost=0.00..4.44 rows=25 width=0) (never executed)
> Index Cond: (user_card_id = 591)
> Total runtime: 0.153 ms
> (12 rows)
>
> stomp_steps has about 1500 rows, so it finds 1500 high scores, one for
> each stage.
>
> I expected scalability issues from this on a regular drive, since
> it'll be doing a ton of index seeking when not working out of cache,
> so I expected to need to change to an SSD at some point (when it no
> longer easily fits in cache). However, I/O doesn't seem to be the
> bottleneck yet. If I run it several times, it consistently takes 26
> seconds. The entire database is in OS cache (find | xargs cat:
> 250ms).
>
> I'm not sure why the full query (26s) is orders of magnitude slower
> than 1500*0.150ms (225ms). It's not a very complex query, and I'd
> hope it's not being re-planned every iteration through the loop. Any
> thoughts? Using SELECT to iterate over a table like this is very
> useful (and I don't know any practical alternative), but it's
> difficult to profile since it doesn't play nice with EXPLAIN ANALYZE.
>
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Matt White 2010-02-02 00:53:56 Slow-ish Query Needs Some Love
Previous Message Віталій Тимчишин 2010-02-01 13:30:57 Re: Constraint propagating for equal fields