Re: Slow query: table iteration (8.3)

From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Glenn Maynard <glenn(at)zewt(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow query: table iteration (8.3)
Date: 2010-02-05 11:17:07
Message-ID: 4B6BFE33.5080809@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Glenn Maynard wrote:
> The function version:
>
> CREATE FUNCTION high_score_for_user(user_id int) RETURNS SETOF INTEGER
> LANGUAGE SQL AS $$
> SELECT t.id FROM test t
> WHERE t.user_id = $1
> ORDER BY t.score DESC LIMIT 1
> $$;
> SELECT high_score_for_user(u.id) FROM test_users u;
>
> runs in 100ms.
>
Hi Glenn,

About cached plans of SQL functions: from the source of function.c

00067 /*
00068 * An SQLFunctionCache record is built during the first call,
00069 * and linked to from the fn_extra field of the FmgrInfo struct.
00070 *
00071 * Note that currently this has only the lifespan of the calling
query.
00072 * Someday we might want to consider caching the parse/plan
results longer
00073 * than that.
00074 */

So it is planned at every call of

SELECT high_score_for_user(u.id) FROM test_users u;

and the cache is used between each row of test_users. The plan is with a
parameter, that means the optimizer could not make use of an actual
value during planning. However, your test case is clever in the sense
that there is an index on users and score and the sql function has an
order by that matches the index, so the planner can avoid a sort by
accessing the test table using the index. In this particular case, that
means that the plan is optimal; no unneeded tuples are processed and the
(function) plan complexity is logaritmic on the size of the test
relation, you can't get it any better than that. In short: the lack of
an actual parameter in the test case did not result in an inferior plan.
So using a dynamic constructed query string in pl/pgsql to 'force'
replanning during iteration cannot be faster than this sql function.

It is possible to make the performance if this function worse by
disabling indexscans:

CREATE FUNCTION high_score_for_user(user_id int) RETURNS SETOF INTEGER
LANGUAGE SQL AS $$
SELECT t.id FROM test t
WHERE t.user_id = $1
ORDER BY t.score DESC LIMIT 1
$$
SET enable_indexscan = off;

Now the query time with test_users is over a second. So maybe the
converse could also be true in your production setup using the same
technique.

regards,
Yeb Havinga

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Wojtek 2010-02-05 12:32:37 index on partitioned table
Previous Message Albe Laurenz 2010-02-05 09:00:09 Re: foreign key constraint lock behavour in postgresql