Interesting performance for function between 7.4.6 and 8.0rc2

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: pgsql-hackers(at)postgresql(dot)org, pgsql-general(at)postgresql(dot)org, plphp(at)commandprompt(dot)com
Subject: Interesting performance for function between 7.4.6 and 8.0rc2
Date: 2004-12-21 21:32:48
Message-ID: 41C89680.2000209@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Hello,

First some specifics:

OS Linux FC1
1 Gig of Ram
IDE RAID 1 array
AMD 2200 XP

Running both 8.0rc2 and 7.4.6 for testing. The function is a plphp
function. The code for the function is below:

CREATE OR REPLACE FUNCTION get_users(bigint[]) RETURNS SETOF
resolved_users_type AS '
$userData = array();
$temp = array();
foreach ($args[0] as $value) { $temp[] = $value."::bigint"; }
$res = spi_exec_query("SELECT id, realname, email FROM users WHERE
id IN (".implode(",", $temp).")");
if ($res[''processed''] > 0) {
while ($row = spi_fetch_row($res)) {
$userData[] = $row;
}
}
return ($userData) ? $userData : NULL;
' LANGUAGE 'plphp';

Basically we take an array of values and look up a list of users based
on the elements in the array and return the results. There is a static
cast for 7.4.6 sake which may not be required for 8.

Here is a sample dataset for the users table:

id | realname | email
------+--------------------+------------------------------
2 | Jonathan Daugherty | jdaugherty(at)commandprompt(dot)com
4 | Steven1 Klassen | sklassen1(at)commandprompt(dot)com
8 | Steven1 Klassen | sklassen1(at)commandprompt(dot)com
9 | Steven1 Klassen | sklassen1(at)commandprompt(dot)com
12 | Steven1 Klassen | sklassen1(at)commandprompt(dot)com
56 | Jonathan Daugherty | jdaugherty(at)commandprompt(dot)com
2000 | Steven Klassen | sklassen(at)commandprompt(dot)com
23 | Steven1 Klassen | sklassen1(at)commandprompt(dot)com
690 | Steven Klassen | sklassen(at)commandprompt(dot)com
4000 | Steven Klassen | sklassen(at)commandprompt(dot)com
1243 | Steven Klassen | sklassen(at)commandprompt(dot)com
5 | Steven1 Klassen | sklassen1(at)commandprompt(dot)com
6 | Steven1 Klassen | sklassen1(at)commandprompt(dot)com
230 | Steven Klassen | sklassen(at)commandprompt(dot)com
330 | Steven Klassen | sklassen(at)commandprompt(dot)com
430 | Steven Klassen | sklassen(at)commandprompt(dot)com
440 | Steven Klassen | sklassen(at)commandprompt(dot)com
550 | Steven Klassen | sklassen(at)commandprompt(dot)com
660 | Steven Klassen | sklassen(at)commandprompt(dot)com
770 | Steven Klassen | sklassen(at)commandprompt(dot)com

Here is the functional_query we are running:

SELECT * FROM
get_users('{2,4,8,9,12,56,2000,23,690,4000,1243,5,6,230,330,430,440,550,660,770}');

Statistics is set to 50 on the id column (which is way to high). It has
about 51000 rows.

8.0 first run: 109.664 ms
8.0 second run: 2.939 ms

7.4.6 first run: 121.416ms
7.4.6 second run: 8.665 ms

So this shows a significant increase in functions performance for second
+ runs and a little difference for first runs. If I enable preload
libraries the numbers are the following:

8.0 first run: 107.689ms
8.0 second run: 2.915 ms

7.4.6 first run: 119.400 ms
7.4.6 second run: 8.629 ms

It doesn't look like preload libraries helps as much as I would like but
every little bit helps. The really good news of course is the amazing
increase in performance from 7.4.6 to 8.0. I assume this is because
we are now keeping statistics for functions.

The changes in 8.0 should help projects like TSearch2 quite a bit.

One thing I did note that is very odd is:

sklassen=# explain analyze select * from
get_users('{2,4,8,9,12,56,2000,23,690,4000,1243,5,6,230,330,430,440,550,660,770}');
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Function Scan on get_users (cost=0.00..12.50 rows=1000 width=72)
(actual time=2.464..2.488 rows=20 loops=1)
Total runtime: 2.520 ms

SELECT id, realname, email FROM users WHERE id IN
(2::bigint,4::bigint,8::bigint,9::bigint,12::bigint,56::bigint,2000::bigint,23::bigint,690::bigint,4000::bigint,1243::bigint,5::bigint,6::bigint,230::bigint,330::bigint,430::bigint,440::bigint,550::bigint,660::bigint,770::bigint)"
LOG: duration: 2.937 ms

Notice the two durations, they are different but for the exact same run
on the query. Is the duration being calculated between psql and the backend?

Sincerely,

Joshua D. Drake

--
Command Prompt, Inc., home of PostgreSQL Replication, and plPHP.
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd(at)commandprompt(dot)com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL

Attachment Content-Type Size
jd.vcf text/x-vcard 640 bytes

Browse pgsql-general by date

  From Date Subject
Next Message Jim C. Nasby 2004-12-21 21:36:10 PostgreSQL training curriculum
Previous Message Jim C. Nasby 2004-12-21 21:29:33 Re: Scheduler in Postgres

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2004-12-21 21:42:50 Re: plperl: memory usage stacking
Previous Message biz 2004-12-21 21:28:21 plperl: memory usage stacking