| From: | Milan Oparnica <milan(dot)opa(at)gmail(dot)com> | 
|---|---|
| To: | pgsql-sql(at)postgresql(dot)org | 
| Subject: | Re: PERSISTANT PREPARE (another point of view) | 
| Date: | 2008-07-25 12:58:34 | 
| Message-ID: | g6cilk$2chv$1@news.hub.org | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-sql | 
Craig Ringer wrote:
> Milan Oparnica wrote:
>> I found this link from IBM DB2 developers showing why PERSISTENT 
>> PREPARE is a good idea and how could it be implemented.
> 
> [snip]
> 
>> NONE OF POPULAR SQL DBMS (Oracle, MS SQL, MySQL, Postgre, INTERBASE, 
>> FIREBIRD) HAVE THIS FEATURE.
>>
>> WHY ?
> 
> I suspect that people tend to use SQL or PL/PgSQL stored procedures 
> instead. I'm not 100% sure SQL functions cache their query plans, but I 
> know PL/PgSQL does.
> 
> Exactly what is gained by the use of persistent prepare over the use of 
> a stored procedure?
> 
Its easier to write and call prepared statements then SP's.
When writing, you just create parameters and SQL body of the statement.
When calling from the application or report engine, all you have to do 
is "EXEC <statement name> (<parameters>)".
In case of SP's written as SET OF CustomDataType it's pretty easy to 
call ("SELECT * FROM <proc> (<params>)) but its hard to maintain 
CutomDataType when changing table structures.
In case of SP's written using CURSORS calling convention is not simple.
BEGIN;
SELECT <proc>('cursor_name', <input parameter>);
FETCH ALL FROM cursor_name;
END;
This usually requires using more complicating ways to define source 
recordsets in reports designers. I'm also not sure how cursors use 
indexes of the tables. I've run some tests on several millions records 
with and without indexes and found out that there was no significant 
difference in performance. Plus, for now, CURSORS remain READ-ONLY in PG.
There is one thing more. Prepared statements could be used as table-type 
  (or view-type) datasource. That means we could run prepared statements 
over other prepared statements accepting same parameters. This makes 
possible creating complex "low-level" BASE queries and dozens of 
reporting queries that rely on them. Something like subquery structures:
SELECT A.Fld1,B.Fld2
FROM
(SELECT Fld1 FROM Table1 WHERE xxx) A INNER JOIN Table2 B ON A.Fld1=B.Fld2
WHERE yyy
> What would the interface to the feature be through database access 
> drivers like JDBC? Explicit PREPARE GLOBAL or similar, then invocation 
> with EXECUTE ?
> 
They could remain the same. If possible, when calling EXECUTE <xxx> PG 
could try local-then global (or reverse) PREPARED STATEMENTS pool, and 
run the statement that was first found. This is greatly simplified. It 
would require much detailed planning if this option is to be built in PG.
> How would users using increasingly common layers like Hibernate/JPA use it?
> 
I don't now, I never used Hibernate/JPA. Somebody could speak for that. 
Making PERSISTANT PREPARE statement available in PG doesn't mean that 
everybody must use it.
> I'm also curious about how you'd address the possible need for periodic 
> re-planning as the data changes, though AFAIK SQL functions suffer from 
> the same limitation there.
> 
Perhaps there could be several ways. It could be implemented within 
VACUUM procedure. It could re-plan all global prepared statements 
according to new circumstances. There should also be a manual 
re-planning command of specific statement maybe even with passing 
parameters to it so the optimization could be even more efficient.
> I guess I personally just don't understand what the point of the 
> persistent prepare feature you describe is. However, this post that you 
> linked to:
> 
> http://archives.postgresql.org/pgsql-hackers/2008-04/msg00867.php
> 
> actually describes a query plan cache, rather than persistent prepare. 
> The post assumes the app will explicitly manage the cache, which I'm not 
> sure is a good idea, but I can see the point of a plan cache. There 
> might be some heuristics Pg could use to decide what to cache and to 
> evict (planner time cost vs memory use, frequency of use, etc) so the 
> app doesn't have to know or care about the plan cache.  However, I'm not 
> too sure how you'd match an incoming query to a cached plan, and 
> determine that the plan was still valid, with enough speed to really 
> benefit from the plan cache. Then again, I don't know much about Pg's 
> innards, so that doesn't mean much.
> 
> Tom Lane responded to that post to point out some of the complexities:
> 
> http://archives.postgresql.org/pgsql-hackers/2008-04/msg00868.php
> 
> -- 
> Craig Ringer
> 
I think persistent prepare has to do something with global caching 
plans. Preparing statement actually prepares a plan of its execution 
giving us the possibility to execute it. With persistent prepare 
statement it somehow these plans have to go from local to global. I 
agree they shouldn't be managed by app itself. App can only create 
requests for creation, modification, execution and removing of such object.
This post is about a point of view. I know we can use stored procedures 
and views and other good features of PG to manipulate all data we need.
What I aim is: it can be done in more simple and efficient way with 
PERSISTENT PREPARE implementation. Let me point the key benefits:
- huge number of database developers could use this technique without 
getting involved with CURSORS and CUSTOM DATA TYPE SETS.
This means a lot of hard-coded sQL commands built into apps itself 
easily transfered to database layer.
It also means thousands of Microsoft Jet based applications migrated to 
PG. Why ? Because databases are getting larger as project advance and 
companies grow, and PG could offer an easier way to upsize then their 
own Microsoft SQL engine does.
It also means a lot of new and less experienced database programmers 
turn their focus on PG. Its much easier to understand PREPARED 
statements than SP constructions returning rowsets.
- EXEC(UTE) is a well known SQL command understood and implemented by 
any database engine almost the same way. It's also natively supported by 
report designing tools and ADO and ODBC drivers. It means that front end 
app doesn't even have to notice if the statement is locally or globally 
prepared.
This means that some modules of the system could create global 
statements and other modules would simply use them, assuming that they 
are in the DB as a part of "setup" process.
Lots of functionality is implemented in PG not all of them used by 
developers. My opinion is that implementation of PERSISTENT or GLOBAL 
PREPARED STATEMENT would be of much use.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Emi Lu | 2008-07-25 14:27:45 | Re: Query prepared plan | 
| Previous Message | A. Kretschmer | 2008-07-25 12:23:50 | Re: A table with two names or table alias |