Re: "stored procedures" - use cases?

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Peter Eisentraut" <peter_e(at)gmx(dot)net>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: "stored procedures" - use cases?
Date: 2011-04-26 22:49:50
Message-ID: 4DB705BE020000250003CECC@gw.wicourts.gov

Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote:
>> (1) All the \d commands in psql should be implemented in SPs so
>> that they are available from any client, through calling one SP
>> equivalent to one \d command.
>
> You don't need stored procedures with special transaction behavior
> for this. In fact, you probably shouldn't use them even if you
> had them, because you surely want a consistent view of, say, a
> table.

Agreed. I was just outlining use cases here, not trying to make a
case for something in particular with each one.

>> (2) In certain types of loads -- in particular converting data
>> from old systems into the database for a new system -- you need
>> to load several tables in parallel, with queries among the tables
>> which are being loaded. The ability to batch many DML statements
>> into one transaction is important, to avoid excessive COMMIT
>> overhead and related disk output; however, the ability to ANALYZE
>> tables periodically is equally important, to prevent each access
>> to an initially-empty table from being done as a table scan after
>> it has millions of rows. VACUUM might become equally important
>> if there are counts or totals being accumulated in some tables,
>> or status columns are being updated, as rows are added to other
>> tables.
>
> I'm not sure I really follow this. If your aim is to batch DML
> statements and avoid COMMIT overhead, why would you want to use
> stored procedures that possibly span multiple transactions?

The point is that if such a conversion runs with every table access
planned as though the tables were still empty, it starts to get
pretty slow after a while. You need to commit, analyze, and start a
new transaction so that subsequent queries get plans which run
well. This obviously isn't an issue when you're
blasting entire tables in through COPY commands without needing to
reference other data being concurrently loaded.

So, rough pseudo-code where this is done in a client app with
autovacuum disabled would look something like:

open input stream of non-normalized data
open database connection
while not EOF on input
    start transaction
    for 50000 top-level inputs (break on EOF)
        parse apart messy data, load into multiple tables
            (logic involves queries against tables being loaded)
            (some updates besides straight inserts)
            (print exceptions for questionable or indigestible data)
    end for
    commit transaction
    vacuum analyze
end while
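
For concreteness, a minimal sketch of that loop as a Python 3 client
using psycopg2 might look like the following. The connection string,
table and column names, the parse step, and the per-record insert are
purely illustrative placeholders, and it assumes autovacuum is
disabled for the load as described above; the savepoint is just one
way to "print exceptions" for bad records without losing the whole
batch:

    import sys
    import psycopg2

    BATCH_SIZE = 50000

    def parse(line):
        # Stand-in for "parse apart messy data": split a delimited line.
        return line.rstrip("\n").split("|")

    def load_record(cur, fields):
        # Stand-in for the real per-record logic: queries against the
        # tables being loaded, inserts, and the occasional update.
        cur.execute("INSERT INTO target_table (payload) VALUES (%s)",
                    ("|".join(fields),))

    def run_load(path):
        conn = psycopg2.connect("dbname=newsystem")
        cur = conn.cursor()
        with open(path) as stream:          # input stream of non-normalized data
            eof = False
            while not eof:
                # psycopg2 opens a transaction implicitly on the first execute
                for _ in range(BATCH_SIZE):  # 50000 top-level inputs per batch
                    line = stream.readline()
                    if not line:
                        eof = True
                        break
                    cur.execute("SAVEPOINT rec")  # one bad record shouldn't sink the batch
                    try:
                        load_record(cur, parse(line))
                        cur.execute("RELEASE SAVEPOINT rec")
                    except psycopg2.Error as exc:
                        cur.execute("ROLLBACK TO SAVEPOINT rec")
                        print("rejected: %s (%s)" % (line.rstrip(), exc),
                              file=sys.stderr)
                conn.commit()                 # commit transaction
                conn.autocommit = True        # VACUUM can't run inside a transaction block
                cur.execute("VACUUM ANALYZE")
                conn.autocommit = False
        conn.close()

    if __name__ == "__main__":
        run_load(sys.argv[1])

Note that the VACUUM ANALYZE step has to run outside the batch
transaction, which is why autocommit is toggled around it.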

In database products with stored procedures, it has usually been
faster to use an SP in the target database than to use a client
program.

-Kevin
