Re: Help request to improve function performance

From: Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com>
To: sarikan <serefarikan(at)kurumsalteknoloji(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Help request to improve function performance
Date: 2009-04-22 20:58:04
Message-ID: 92869e660904221358g7b7a8592iaec2fe7c9a290f68@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

2009/4/22 sarikan <serefarikan(at)kurumsalteknoloji(dot)com>

>
> Dear members of the list,
> I have a function which returns a custom type, that has only two fields,
> each of them being varchar arrays.
> The reason that I have written this function is that I have a table
> basically with the following structure (with simplified column names)
>
> name_col1 name_col2 sessionId
> value1 value3 id1
> value2 value2 id1
> value4 value4 id1
> value7 value4 id2
> value2 value2 id2
> value4 value4 id2
> value1 value5 id3
>

Why not post your REAL schema? It would make life easier, both for you and
for people trying to help.

>
> So mutliple rows are bound together with sessionIds, and I need to get back
> all rows with a query, grouped by sessionID. However, group by sql
> statement
> does not solve my problem, since I get back a lot of rows, which I have to
> group into objects again in my application. What I need is a way to return
> all rows having the same sessionId as a single row. Of course this is not
> possible with this table, so I've created a custom type, which has array
> type columns. The following function gets all rows that belongs to a
> patient, and for each session id, it inserts rows with that session id into
> array fields of the custom type.
> The problem is, it is very slow! Getting back all the rows with a select
> takes 360 ms, while getting back the results of this function takes 50
> seconds! Is there any way I can make the following function faster, or any
> other methods you can recommend to do what I'm trying to do? I am trying
> to
> avoid hundreds of calls to db, or grouping query results in my middleware
> application. Here comes the function, and your help will be much
> appreciated.
>
> Best Regards
> Seref
>
>
(below code edited to be more readable; logic unchanged)

>
> CREATE or REPLACE FUNCTION getNodeContainers( context_Id varchar) RETURNS
> setof NodesContainer AS
> $$
> DECLARE
> archetype_data_row app.archetype_data;
> archetype_data_row_main app.archetype_data;
> nodescontainervar NodesContainer;
> session_Id varchar;
> indexVar integer := 0;
> BEGIN
> CREATE TEMP TABLE all_rows_of_patient AS select * from
> app.archetype_data
> WHERE context_id = context_Id;
> FOR session_Id IN
> SELECT distinct session_id from all_rows_of_patient
> LOOP -- do the following for each session_ID
> indexVar := 0;
> FOR archetype_data_row IN --select rows that belong to this session
> ID
> SELECT * from all_rows_of_patient
> WHERE session_id = session_Id and context_id = context_Id
> LOOP
> nodescontainervar.name[indexVar] := archetype_data_row.name
> ;
> nodescontainervar.context_Id[indexVar] :=
> archetype_data_row.context_Id;
> indexVar := indexVar + 1;
> END LOOP;
> return NEXT nodescontainervar;
> END LOOP;
> drop table all_rows_of_patient;
> return;
> END;
> $$ LANGUAGE 'plpgsql';
>
>
please read above code - thats what postgres actually executes. column names
have precedence before variable names in name resolution.
conditions like
WHERE context_id = context_Id;
WHERE session_id = session_Id and context_id = context_Id
are obviously no-op conditions, not what you really want.

I hope now it's clear now why this function has long execution time :)

some other remarks:

1) you use temp tables inside a function, which is rather bad (search
archives for explanation). try to avoid it.
2) usage of indexvar is not needed - there are array operators and functions
3) if you get rid of temp table, this function could be marked as
STABLE,which will prevent penalty in some strange situations
4) mark your function as STRICT, which will save some CPU cycles when
someone calls it on null input

HTH.

--
Filip Rembiałkowski
JID,mailto:filip(dot)rembialkowski(at)gmail(dot)com
http://filip.rembialkowski.net/

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Hiroshi Inoue 2009-04-22 21:35:28 Re: trouble with to_char('L')
Previous Message Filip Rembiałkowski 2009-04-22 20:17:08 Re: thanks for the "testing" replies ; now my first question - Logs say update done but not actually done or committed into database