Skip site navigation (1) Skip section navigation (2)

Interesting performance for function between 7.4.6 and 8.0rc2

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: pgsql-hackers(at)postgresql(dot)org, pgsql-general(at)postgresql(dot)org,plphp(at)commandprompt(dot)com
Subject: Interesting performance for function between 7.4.6 and 8.0rc2
Date: 2004-12-21 21:32:48
Message-ID: 41C89680.2000209@commandprompt.com (view raw or flat)
Thread:
Lists: pgsql-generalpgsql-hackers
Hello,

First some specifics:

OS Linux FC1
1 Gig of Ram
IDE RAID 1 array
AMD 2200 XP

Running both 8.0rc2 and 7.4.6 for testing. The function is a plphp
function. The code for the function is below:


CREATE OR REPLACE FUNCTION get_users(bigint[]) RETURNS SETOF 
resolved_users_type AS '
     $userData = array();
     $temp = array();
     foreach ($args[0] as $value) { $temp[] = $value."::bigint"; }
     $res = spi_exec_query("SELECT id, realname, email FROM users WHERE 
id IN (".implode(",", $temp).")");
     if ($res[''processed''] > 0) {
         while ($row = spi_fetch_row($res)) {
             $userData[] = $row;
         }
     }
return ($userData) ? $userData : NULL;
' LANGUAGE 'plphp';


Basically we take an array of values and look up a list of users based 
on the elements in the array and return the results. There is a static
cast for 7.4.6 sake which may not be required for 8.

Here is a sample dataset for the users table:

  id  |      realname      |            email
------+--------------------+------------------------------
     2 | Jonathan Daugherty | jdaugherty(at)commandprompt(dot)com
     4 | Steven1 Klassen    | sklassen1(at)commandprompt(dot)com
     8 | Steven1 Klassen    | sklassen1(at)commandprompt(dot)com
     9 | Steven1 Klassen    | sklassen1(at)commandprompt(dot)com
    12 | Steven1 Klassen    | sklassen1(at)commandprompt(dot)com
    56 | Jonathan Daugherty | jdaugherty(at)commandprompt(dot)com
  2000 | Steven Klassen     | sklassen(at)commandprompt(dot)com
    23 | Steven1 Klassen    | sklassen1(at)commandprompt(dot)com
   690 | Steven Klassen     | sklassen(at)commandprompt(dot)com
  4000 | Steven Klassen     | sklassen(at)commandprompt(dot)com
  1243 | Steven Klassen     | sklassen(at)commandprompt(dot)com
     5 | Steven1 Klassen    | sklassen1(at)commandprompt(dot)com
     6 | Steven1 Klassen    | sklassen1(at)commandprompt(dot)com
   230 | Steven Klassen     | sklassen(at)commandprompt(dot)com
   330 | Steven Klassen     | sklassen(at)commandprompt(dot)com
   430 | Steven Klassen     | sklassen(at)commandprompt(dot)com
   440 | Steven Klassen     | sklassen(at)commandprompt(dot)com
   550 | Steven Klassen     | sklassen(at)commandprompt(dot)com
   660 | Steven Klassen     | sklassen(at)commandprompt(dot)com
   770 | Steven Klassen     | sklassen(at)commandprompt(dot)com


Here is the functional_query we are running:

SELECT * FROM 
get_users('{2,4,8,9,12,56,2000,23,690,4000,1243,5,6,230,330,430,440,550,660,770}');

Statistics is set to 50 on the id column (which is way to high). It has 
about 51000 rows.

8.0 first run: 109.664 ms
8.0 second run: 2.939 ms

7.4.6 first run: 121.416ms
7.4.6 second run: 8.665 ms

So this shows a significant increase in functions performance for second 
+ runs and a little difference for first runs. If I enable preload 
libraries the numbers are the following:

8.0 first run: 107.689ms
8.0 second run: 2.915 ms

7.4.6 first run: 119.400 ms
7.4.6 second run: 8.629 ms

It doesn't look like preload libraries helps as much as I would like but 
every little bit helps. The really good news of course is the amazing
increase in performance from 7.4.6 to 8.0. I assume this is because
we are now keeping statistics for functions.

The changes in 8.0 should help projects like TSearch2 quite a bit.

One thing I did note that is very odd is:

sklassen=# explain analyze select * from 
get_users('{2,4,8,9,12,56,2000,23,690,4000,1243,5,6,230,330,430,440,550,660,770}');
                                                   QUERY PLAN
--------------------------------------------------------------------------------------------------------------
  Function Scan on get_users  (cost=0.00..12.50 rows=1000 width=72) 
(actual time=2.464..2.488 rows=20 loops=1)
  Total runtime: 2.520 ms


SELECT id, realname, email FROM users WHERE id IN 
(2::bigint,4::bigint,8::bigint,9::bigint,12::bigint,56::bigint,2000::bigint,23::bigint,690::bigint,4000::bigint,1243::bigint,5::bigint,6::bigint,230::bigint,330::bigint,430::bigint,440::bigint,550::bigint,660::bigint,770::bigint)"
LOG:  duration: 2.937 ms


Notice the two durations, they are different but for the exact same run 
on the query. Is the duration being calculated between psql and the backend?


Sincerely,

Joshua D. Drake

-- 
Command Prompt, Inc., home of PostgreSQL Replication, and plPHP.
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd(at)commandprompt(dot)com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL

Attachment: jd.vcf
Description: text/x-vcard (640 bytes)

pgsql-hackers by date

Next:From: Andrew DunstanDate: 2004-12-21 21:42:50
Subject: Re: plperl: memory usage stacking
Previous:From: bizDate: 2004-12-21 21:28:21
Subject: plperl: memory usage stacking

pgsql-general by date

Next:From: Jim C. NasbyDate: 2004-12-21 21:36:10
Subject: PostgreSQL training curriculum
Previous:From: Jim C. NasbyDate: 2004-12-21 21:29:33
Subject: Re: Scheduler in Postgres

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group