Re: immutable functions vs. join for lookups ?

From: Dawid Kuroczko <qnex42(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: immutable functions vs. join for lookups ?
Date: 2005-04-18 09:00:38
Message-ID: 758d5e7f05041802004ee56be4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 4/15/05, Enrico Weigelt <weigelt(at)metux(dot)de> wrote:
> a) SELECT items.a, items.b, ..., users.username FROM items, users
> WHERE items.uid = users.uid;
>
> c) CREATE FUNCTION id2username(oid) RETURNS text
> LANGUAGE 'SQL' IMMUTABLE AS '
> SELECT username AS RESULT FROM users WHERE uid = $1';

You will be told that this function is not immutable but stable, and this
is quite right. But consider such a function:

CREATE OR REPLACE FUNCTION id2username (oid int) RETURNS TEXT AS $$
BEGIN
IF oid = 0 THEN RETURN 'foo';
ELSIF oid = 1 THEN RETURN 'bar';
END IF;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

versus a lookup table with similar data. Logic suggests it should be faster
than a table... It got me worried when I added: "RAISE WARNING 'Called'"
after begin and I got lots of "Called" warnings when using this IMMUTABLE
function in select... And the timings for ~6000 values in aaa table
(and two values in lookup table) are:

There is a query, output of the EXPLAIN ANALYZE, Time of EXPLAIN
ANALYZE and "Real time" of SELECT (without EXPLAIN ANALYZE):

a) simple select from temp table, and a lookup cost:
EXPLAIN ANALYZE SELECT n FROM aaa;
Seq Scan on aaa (cost=0.00..87.92 rows=5992 width=4) (actual
time=0.011..24.849 rows=6144 loops=1)
Total runtime: 51.881 ms
(2 rows)
Time: 52,882 ms
Real time: 16,261 ms

EXPLAIN ANALYZE SELECT id2username(n) FROM aaa limit 2;
Limit (cost=0.00..0.03 rows=2 width=4) (actual time=0.111..0.150
rows=2 loops=1)
-> Seq Scan on aaa (cost=0.00..104.80 rows=6144 width=4) (actual
time=0.102..0.129 rows=2 loops=1)
Total runtime: 0.224 ms
(3 rows)
Time: 1,308 ms
Real time: 1,380 ms

b) natural join with lookup table:
EXPLAIN ANALYZE SELECT username FROM aaa NATURAL JOIN lookup;
Hash Join (cost=2.45..155.09 rows=3476 width=32) (actual
time=0.306..83.677 rows=6144 loops=1)
Hash Cond: ("outer".n = "inner".n)
-> Seq Scan on aaa (cost=0.00..87.92 rows=5992 width=4) (actual
time=0.006..25.517 rows=6144 loops=1)
-> Hash (cost=2.16..2.16 rows=116 width=36) (actual
time=0.237..0.237 rows=0 loops=1)
-> Seq Scan on lookup (cost=0.00..2.16 rows=116 width=36)
(actual time=0.016..0.034 rows=2 loops=1)
Total runtime: 107.378 ms
(6 rows)
Time: 109,040 ms
Real time: 25,364 ms

c) IMMUTABLE "static" lookup function:
EXPLAIN ANALYZE SELECT id2username(n) FROM aaa;
Seq Scan on aaa (cost=0.00..104.80 rows=6144 width=4) (actual
time=0.089..116.397 rows=6144 loops=1)
Total runtime: 143.800 ms
(2 rows)
Time: 144,869 ms
Real time: 102,428 ms

d) self-join with a function ;)
EXPLAIN ANALYZE SELECT * FROM (SELECT n, id2username(n) AS username
FROM (SELECT DISTINCT n FROM aaa) AS values) AS v_lookup RIGHT JOIN
aaa USING (n);
Hash Left Join (cost=506.82..688.42 rows=6144 width=36) (actual
time=102.382..182.661 rows=6144 loops=1)
Hash Cond: ("outer".n = "inner".n)
-> Seq Scan on aaa (cost=0.00..89.44 rows=6144 width=4) (actual
time=0.012..24.360 rows=6144 loops=1)
-> Hash (cost=506.82..506.82 rows=2 width=36) (actual
time=102.217..102.217 rows=0 loops=1)
-> Subquery Scan v_lookup (cost=476.05..506.82 rows=2
width=36) (actual time=53.626..102.057 rows=2 loops=1)
-> Subquery Scan "values" (cost=476.05..506.80 rows=2
width=4) (actual time=53.613..102.023 rows=2 loops=1)
-> Unique (cost=476.05..506.77 rows=2 width=4)
(actual time=53.456..101.772 rows=2 loops=1)
-> Sort (cost=476.05..491.41 rows=6144
width=4) (actual time=53.440..76.710 rows=6144 loops=1)
Sort Key: n
-> Seq Scan on aaa
(cost=0.00..89.44 rows=6144 width=4) (actual time=0.013..26.626
rows=6144 loops=1)
Total runtime: 209.378 ms
(11 rows)
Time: 211,460 ms
Real time: 46,682 ms

...so this IMMUTABLE is twice as slow (~100 ms) as the query joining
itself with a SELECT DISTINCT on an IMMUTABLE function (~50 ms),
which is twice as slow as JOIN against lookup table (~25 ms), and I feel
this IMMUTABLE function could be around ~20 ms (~16 ms plus
calling the function two times plus giving the values).

Ah, and this is PostgreSQL 8.0.1 running under FreeBSD on a
CPU: Intel(R) Celeron(R) CPU 2.40GHz (2400.10-MHz 686-class CPU).

Regards,
Dawid

PS: I have a feeling that IMMUTABLE functions worked better in 7.4,
yet I am unable to confirm this.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Gordon Haverland 2005-04-18 11:12:03 Debian stable, was Re: Storing Large Objects
Previous Message Nick Trainor 2005-04-18 08:07:23 UNSUBSCRIBE