ORDER BY user defined function performance issues

From: Nick Trainor <nick(dot)trainor(at)trainorthornton(dot)co(dot)uk>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: ORDER BY user defined function performance issues
Date: 2004-06-11 11:14:55
Message-ID: 20040611110837.E5F5215D888@ns.trainorthornton.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I have an application which logs interactions on a regular basis. The interaction details (their types, etc) are held in one table (tblitem) and the 'hits' are held in tbltotal.

I have written a function to get the total 'hits' during a period and need to collect together the information from tblitem with it.

The query works OK returning results in under a second:

EXPLAIN ANALYSE SELECT t1.value1,t1.value2,getday_total('1','23',t1.id::integer,'31','59','2','2004','182','153','6','2004','0') FROM tblitem t1 WHERE t1.type_id=23::int2 and (t1.id >= 1::int8 and t1.id<=9223372036854775807::int8)
OFFSET 0 LIMIT 20;
tracker-# QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..7.70 rows=20 width=56) (actual time=19.50..846.89 rows=20 loops=1)
-> Index Scan using tblitemtype_id on tblitem t1 (cost=0.00..230.10 rows=598 width=56) (actual time=19.49..846.81 rows=21 loops=1)
Index Cond: (type_id = 23::smallint)
Filter: ((id >= 1::bigint) AND (id <= 9223372036854775807::bigint))
Total runtime: 847.04 msec

----
I realised that Postgresql did not like passing t1.id to the function without some form of constraints - hence the (t1.id >= 1::int8 and t1.id<=9223372036854775807::int8) dummy constraints.
----

However, when I seek to ORDER the results, then it takes 'forever':

EXPLAIN ANALYSE SELECT t1.value1,t1.value2,
getday_total('1','23',t1.id::integer,'31','59','2','2004','182','153','6','2004','0') FROM tblitem t1 WHERE t1.type_id=23::int2 and (t1.id >= 1::int8 and t1.id<=9223372036854775807::int8)
ORDER BY getday_total('1','23',t1.id::integer,'31','59','2','2004','182','153','6','2004','0') DESC
OFFSET 0 LIMIT 20;

tracker-# tracker-#
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=257.66..257.71 rows=20 width=56) (actual time=25930.90..25930.95 rows=20 loops=1)
-> Sort (cost=257.66..259.15 rows=598 width=56) (actual time=25930.90..25930.91 rows=21 loops=1)
Sort Key: getday_total(1::smallint, 23::smallint, (id)::integer, 31::smallint, 59::smallint, 2::smallint, 2004::smallint, 182::smallint, 153::smallint, 6::smallint, 2004::smallint, 0)
-> Index Scan using tblitemtype_id on tblitem t1 (cost=0.00..230.10 rows=598 width=56) (actual time=19.60..25927.68 rows=693 loops=1)
Index Cond: (type_id = 23::smallint)
Filter: ((id >= 1::bigint) AND (id <= 9223372036854775807::bigint))
Total runtime: 25931.15 msec

And this is a database of only a few thousand rows, we are anticipating that this database is going to get huge.

What am I missing here? How can I get it to order by the total of interactions without hitting the performance problem?

Any help would be much appreciated.

Nick

nick A-T trainorthornton d-o-t co d-o-t uk

Version:
PostgreSQL 7.3.4 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.2 (Mandrake Linux 9.1 3.2.2-3mdk)

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Paul Thomas 2004-06-11 13:41:08 Re: ORDER BY user defined function performance issues
Previous Message Stephan Szabo 2004-06-10 17:14:36 Re: *very* inefficient choice made by the planner (regarding