Re: Replacing Cursors with Temporary Tables

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Eliot Gable <egable+pgsql-performance(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Replacing Cursors with Temporary Tables
Date: 2010-04-22 03:13:16
Message-ID: 4FC14C16-B896-44BF-BCA4-9B324B3670E0@richrelevance.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


On Apr 21, 2010, at 1:16 PM, Eliot Gable wrote:

> I have previously discussed my very long PL/PGSQL stored procedure on this list. However, without getting into too many details, I have another performance-related question.
>
> The procedure currently uses cursors to return multiple result sets to the program executing the procedure. Basically, I do this:
>
> CREATE TEMPORARY TABLE table2 WITH (OIDS=FALSE) ON COMMIT DROP AS (
> SELECT * FROM blah blah blah -- whatever the cursor is defined as doing
> );
> ALTER TABLE table2 ADD PRIMARY KEY (id);
> CREATE INDEX table2_blah_idx ON table2 USING btree (blah);
> ANALYZE table2;
>
> Then, when I need to use the results in another query, I could do:
>
> SELECT * FROM table1 INNER JOIN table2 ON ( blah blah ) WHERE blah
>
> This would use the indexes and the primary key appropriately. I could also ensure that the order of the information in the temporary table is such that it facilitates any joining, where clauses, or order by clauses on the additional queries. Finally, to get results into my application, I would then do:

I have had good luck with temp tables, but beware -- there isn't anything special performance wise about them -- they do as much I/O as a real table without optimizations that know that it will be dropped on commit so it doesn't have to be as fail-safe as ordinary ones. Even so, a quick
CREATE TABLE foo ON COMMIT DROP AS (SELECT ...);
ANALYZE foo;
SELECT FROM foo JOIN bar ... ;
can be very effective for performance.

However, creating the indexes above is going to slow it down a lot. Most likely, the join with a seqscan will be faster than an index build followed by the join. After all, in order to build the index it has to seqscan! If you are consuming these tables for many later select queries rather than just one or two, building the index might help. Otherwise its just a lot of extra work.

I suggest you experiment with the performance differences using psql on a specific use case on real data.

> One final question:
>
> In this conversion to temporary table use, there are a couple of cases where I would prefer to do something like:
>
> prepare blah(blah blah) as select blah;
>
> Then, I want to call this prepared statement multiple times, passing a different argument value each time. The only reason to do this would be to save writing code and to ensure that updating the select statement in once place covers all places where it is used. However, I am concerned it might incur a performance hit by re-preparing the query since I assume that having this inside the PL/PGSQL procedure means it is already prepared once. Can anyone speak to this? I know that I could put it in a separate stored procedure, but then the question becomes, does that add extra overhead? Or, in different words, is it similar to the difference between an inlined function and a non-inlined function in C?

I can't speak for the details in your question, but it brings up a different issue I can speak to:
Prepared statements usually cause the planner to create a generic query plan for all possible inputs. For some queries where the parameters can significantly influence the query plan, this can be a big performance drop. For other queries (particularly inserts or simple selects on PK's) the cached plan saves time.

> I would greatly appreciate any insights to these questions/issues.
>
> Thanks in advance for any assistance anyone can provide.
>
>
> --
> Eliot Gable
>
> "We do not inherit the Earth from our ancestors: we borrow it from our children." ~David Brower
>
> "I decided the words were too conservative for me. We're not borrowing from our children, we're stealing from them--and it's not even considered to be a crime." ~David Brower
>
> "Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not live to eat.) ~Marcus Tullius Cicero

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Vlad Arkhipov 2010-04-22 09:25:46 Optimization idea
Previous Message Robert Haas 2010-04-22 01:21:28 Re: Replacing Cursors with Temporary Tables