Re: Performance Tuning Large PL/PGSQL Stored Procedure

From: Eliot Gable <egable+pgsql-performance(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance Tuning Large PL/PGSQL Stored Procedure
Date: 2010-03-26 03:56:35
Message-ID: bf6923ed1003252056m4c40660em6c7d92fc3a917e0c@mail.gmail.com
Lists: pgsql-performance

On Thu, Mar 25, 2010 at 10:00 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:

> On Tue, Mar 23, 2010 at 5:00 PM, Eliot Gable
> <egable+pgsql-performance(at)gmail(dot)com> wrote:
> > The complex type contains roughly 25 fields, mostly text, plus another 10
> > REFCURSORs.
>
> How many rows min/max/avg are coming back in your refcursors? Are you
> using cursors in order to return multiple complex data structures
> (sets, etc) in a single function call?
>
>
I think the largest number of rows is around 40. Most are substantially
smaller. However, most of them have about two dozen or more columns, and I
have already shortened the list of columns to the minimum possible. The
average number of rows is around 10, but the largest sets of rows also have
the most columns. I'm using the cursors in order to obtain multiple complex
data structures in a single function call.
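
For reference, the pattern looks roughly like this (a minimal sketch with
hypothetical table and cursor names, not my actual procedure):

CREATE OR REPLACE FUNCTION my_stored_proc(p_id int)
RETURNS SETOF refcursor AS $$
DECLARE
    -- Named cursors so the client can FETCH from them by name
    c_orders   refcursor := 'orders_cur';
    c_payments refcursor := 'payments_cur';
BEGIN
    OPEN c_orders   FOR SELECT * FROM orders   WHERE customer_id = p_id;
    OPEN c_payments FOR SELECT * FROM payments WHERE customer_id = p_id;
    RETURN NEXT c_orders;
    RETURN NEXT c_payments;
END;
$$ LANGUAGE plpgsql;

-- The client side then runs, in one transaction:
-- BEGIN;
-- SELECT * FROM my_stored_proc(42);
-- FETCH ALL FROM orders_cur;
-- FETCH ALL FROM payments_cur;
-- COMMIT;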

> > The application that calls the stored procedure was also written by me
> > in C++ and uses asynchronous libpq API commands to execute a single SQL
> > transaction which calls the stored procedure and also performs a FETCH
> > ALL on all open cursors. It then returns all results into various
> > structures. All rows of all cursors that are open are always used for
> > every call to the stored procedure.
> >
> > The stored procedure implements various logic which determines which
> > tables in the database to query and how to filter the results from
> > those queries to return only the relevant information needed by the C++
> > application.
> >
> > Currently, in terms of optimization, I have taken the following
> > approaches based on the following reasoning:
> >
> > 1. For all queries whose results need to return to the C++ application,
> > I utilize cursors so that all results can be readied and generated by
> > the stored procedure with just one call to the PostgreSQL backend. I
> > accomplish this using asynchronous libpq API calls to issue a single
> > transaction to the server. The first command after the BEGIN is a
> > SELECT * FROM MyStoredProc(blah), which is then followed by FETCH ALL
> > commands for each cursor that the stored procedure leaves open. I then
> > follow up with multiple API calls to return the results and retrieve
> > the rows from those results. This minimizes the amount of
> > back-and-forth between my C++ application and the database backend.
> >
> > 1a. Incidentally, I am also using cursors for most queries inside the
> > stored procedure that do not return results to the C++ application. I
> > am unsure whether this incurs a performance penalty compared to doing,
> > for example, a SELECT ... INTO (var1, var2, ...) within the body of the
> > stored procedure. Instead of using SELECT ... INTO, I am using OPEN
> > cursor_name; FETCH cursor_name INTO (var1, var2).
> >
> > 2. I have built indexes on all columns that are used in where clauses
> > and joins.
> >
> > 3. I use lots of joins to pull in data from various tables (there are
> > around 60 tables that are queried with each call to the stored
> > procedure).
> >
> > 4. When performing joins, the first table listed is the one that
> > returns the most context-specific results, which always also means that
> > it has the most-specific and fewest number of relevant rows. I then
> > join them in order of least number of result rows with all inner joins
> > preceding left outer joins.
> >
> > 5. Instead of using UNIONs and EXCEPT clauses, I use multiple WITH
> > clauses to define several different query-specific views. I order them
> > such that I can join additional tables in later WITH clauses to the
> > views created
>
> WITH clauses can make your queries much easier to read and yield great
> speedups if you need to access the table expression multiple times
> from other parts of the query. however, in some cases you can get
> into trouble because a standard set of joins is going to give the
> planner the most flexibility in terms of query optimization.
>
>
So far, every case I have converted to WITH clauses has resulted in more
than double the speed (half the time required to perform the query). The
main gain appears to come from no longer computing the same JOIN
conditions multiple times in different parts of the query, which the
UNION and EXCEPT clauses forced.
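
As a rough illustration of the kind of rewrite I mean (hypothetical
names; the real queries are much larger):

-- EXCEPT replaced by a WITH-defined filter view and a NOT IN clause
WITH candidates AS (
    SELECT r.id, r.payload
    FROM requests r
    JOIN routes rt ON rt.id = r.route_id  -- join computed only once
),
filtered AS (
    SELECT c.id
    FROM candidates c
    JOIN blocklist b ON b.id = c.id
)
SELECT *
FROM candidates
WHERE id NOT IN (SELECT id FROM filtered);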

> > previously in a way that minimizes the number of rows involved in the
> > JOIN operations while still providing provably accurate result sets.
> > The EXCEPT clauses are then replaced by also defining one view which
> > contains a set of IDs that I want filtered from the final result set
> > and using a WHERE id NOT IN (SELECT id FROM filtered_view). Typically,
> > this approach leaves me with just one UNION of two previously defined
> > views (the union is required
>
>
> UNION is always an optimization target (did you mean UNION ALL?)
>
>
Thanks for the suggestion on UNION ALL; I indeed do not need elimination of
duplicates, so UNION ALL is a better option.
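
For the record, the only difference is the duplicate-elimination pass
(hypothetical view names):

-- UNION sorts/hashes the combined rows to remove duplicates;
-- UNION ALL simply concatenates the two result sets.
SELECT id, payload FROM view_a
UNION ALL
SELECT id, payload FROM view_b;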

> > 7. When I have a query I need to execute whose results will be used in
> > several other queries, I currently open the cursor for that query using
> > the FOR ... LOOP construct to retrieve all records in the result set
> > and build a result array using the array_append() method. I then do an
> > unnest(my_array)
>
> do not use array_append. always do array(select ...) whenever it is
> possible. when it isn't, rethink your problem until it is possible.
> only exception is to use array_agg aggregate if your problem really is
> an aggregation type of thing. as a matter of fact, any for...loop is
> an optimization target because a re-think will probably yield a query
> that does the same thing without paying for the loop.
>
>
I suspected that approach was a performance issue, and I will see if I
can find an alternative way of doing it. Based on your feedback, I think
I may know how to do it now.
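
Something like this sketch, perhaps (hypothetical names), replacing the
FOR ... LOOP / array_append() pattern with a single ARRAY(...) expression:

-- Inside the procedure body; instead of:
--   FOR rec IN my_cursor LOOP
--     my_ids := array_append(my_ids, rec.id);
--   END LOOP;
-- build the array in one statement:
my_ids := ARRAY(SELECT id FROM relevant_items WHERE active);

-- ...and later expand it with unnest() as before:
-- SELECT i.* FROM items i JOIN unnest(my_ids) AS u(id) ON i.id = u.id;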

> >
> > For most of the joins, they simply join on foreign key IDs and no
> > additional filtering criteria are used on their information. Only a
> > handful of the joined tables bring in additional criteria by which to
> > filter the result set.
> >
> > The approach used in 7 with cursors and building a result array which
> > is then unnested has me worried in terms of performance. It seems to me
> > there should be some better way to accomplish the same thing.
> >
> > The stored procedure does not perform updates or inserts, only selects.
> >
> >
> > Anyway, if anyone has some insights into performance tweaks or new
> > approaches I might try that may lead to enhanced performance, I would
> > greatly appreciate hearing about them. I am not completely dissatisfied
> > with the performance of the stored procedure, but this is going to be
> > used in a very high volume environment (hundreds or possibly even
> > thousands of calls to this stored procedure every second). The more
> > performant it is, the less hardware I need to deploy. It currently
> > takes about 45ms to execute the query and retrieve all of the results
> > into the C++ application. Query execution time takes up about 16ms of
> > that 45ms. This is on a 3-year-old Core 2 Duo, so it's not exactly
> > top-of-the-line hardware.
>
> If you are chasing milliseconds, using C/C++, and dealing with complex
> data structures coming in/out of the database, I would absolutely
> advise you to check out the libpqtypes library (disclaimer, I co-wrote
> it!) in order to speed up data transfer. The library is highly
> optimized and facilitates all transfers in binary which yields good
> gains when sending types which are expensive to hammer to text (bytea,
> timestamp, etc).
>

The data returned to the application is just rows of strings, numbers
(ints and doubles), and booleans. Would I see a good speedup with
libpqtypes when dealing with those data types?

> In addition, using libpqtypes you can use arrays of composites (in
> 8.3+) to send/receive complex structures (even trees, etc) and pull
> the entire set of data in a single query. This is an alternative to
> the refcursor/fetch method which involves extra round trips and has
> other problems (but is the way to go if you need to progressive fetch
> large amounts of data).
>

So, you are saying that I can return a complex type as a result which
contains arrays of other complex types, and just use my single SELECT
command to retrieve the whole data set? That would be much simpler and, I
imagine, much faster.
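
Just to be sure I understand, on the SQL side that could look something
like this sketch (hypothetical type and table names); libpqtypes would
then pull the whole structure back in a single binary result:

-- Arrays of composites (8.3+) returned from one SELECT
CREATE TYPE line_item AS (description text, quantity int, price numeric);
CREATE TYPE order_summary AS (
    order_id int,
    customer text,
    items    line_item[]  -- nested array of composites
);

SELECT ROW(o.id,
           o.customer,
           ARRAY(SELECT ROW(i.description, i.quantity, i.price)::line_item
                 FROM items i
                 WHERE i.order_id = o.id)
          )::order_summary
FROM orders o;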

> As a general tip, I suggest 'divide and conquer'. Sprinkle your
> procedure with 'raise notice %', gettimeofday(); And record the time
> spent on the various steps of the execution. This will give better
> measurements than pulling pieces of the function out and running them
> outside with constants for the arguments. Identify the problem spots
> and direct your energies there.
>

As a matter of fact, I have a callback messaging function set up so that my
RAISE NOTICE commands call back to my C++ program and generate log messages.
The log messages show date + time + microseconds, so I can see how long it
takes to go through each part.
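
In PL/pgSQL terms, that instrumentation looks roughly like this sketch
(using clock_timestamp(), which advances during a transaction, unlike
now()):

DECLARE
    t0 timestamptz := clock_timestamp();
BEGIN
    -- ... step 1: open the first cursor ...
    RAISE NOTICE 'step 1 elapsed: %', clock_timestamp() - t0;
    t0 := clock_timestamp();
    -- ... step 2 ...
END;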

I really am chasing milliseconds here, and I appreciate all your
feedback. You've given me a relatively large number of possible
optimizations to try out. I will definitely try libpqtypes; that sounds
like a promising way to further cut down on execution time. I think most
of my performance penalty is in transferring the results back to the C++
application.

--
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be a
crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not
live to eat.) ~Marcus Tullius Cicero
