Re: PERSISTANT PREPARE (another point of view)

From: Milan Oparnica <milan(dot)opa(at)gmail(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: PERSISTANT PREPARE (another point of view)
Date: 2008-07-25 12:58:34
Message-ID: g6cilk$2chv$1@news.hub.org
Lists: pgsql-sql

Craig Ringer wrote:
> Milan Oparnica wrote:
>> I found this link from IBM DB2 developers showing why PERSISTENT
>> PREPARE is a good idea and how could it be implemented.
>
> [snip]
>
>> NONE OF POPULAR SQL DBMS (Oracle, MS SQL, MySQL, Postgre, INTERBASE,
>> FIREBIRD) HAVE THIS FEATURE.
>>
>> WHY ?
>
> I suspect that people tend to use SQL or PL/PgSQL stored procedures
> instead. I'm not 100% sure SQL functions cache their query plans, but I
> know PL/PgSQL does.
>
> Exactly what is gained by the use of persistent prepare over the use of
> a stored procedure?
>

It's easier to write and call prepared statements than SPs.
When writing, you just declare the parameters and the SQL body of the
statement. When calling from the application or report engine, all you
have to do is "EXEC <statement name> (<parameters>)".

In the case of SPs written as SETOF CustomDataType it's pretty easy to
call ("SELECT * FROM <proc> (<params>)"), but it's hard to maintain the
CustomDataType when table structures change.
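For instance, the SETOF approach usually looks something like this (type,
table and column names are hypothetical):

CREATE TYPE order_summary AS (order_id integer, total numeric);

CREATE FUNCTION get_orders(integer)
RETURNS SETOF order_summary AS $$
  -- the composite type above must be kept in sync with the orders table
  SELECT order_id, total FROM orders WHERE customer_id = $1;
$$ LANGUAGE sql;

SELECT * FROM get_orders(42);

Every time the underlying table changes, the composite type (and every
function built on it) has to be revisited.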

In the case of SPs written using CURSORS, the calling convention is not simple:

BEGIN;
SELECT <proc>('cursor_name', <input parameter>);
FETCH ALL FROM cursor_name;
END;
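The function behind such a call would be defined roughly like this (a
sketch only, with made-up names):

CREATE FUNCTION get_orders_cur(refcursor, integer)
RETURNS refcursor AS $$
BEGIN
  -- open the unbound cursor passed in by name and hand it back
  OPEN $1 FOR SELECT order_id, total FROM orders WHERE customer_id = $2;
  RETURN $1;
END;
$$ LANGUAGE plpgsql;

so the client ends up juggling a transaction, a cursor name and a FETCH
just to get a plain result set.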

This usually requires more complicated ways of defining source
recordsets in report designers. I'm also not sure how well cursors use
table indexes. I've run some tests on several million records
with and without indexes and found no significant difference in
performance. Plus, for now, CURSORS remain READ-ONLY in PG.

There is one more thing. Prepared statements could be used as a table-type
(or view-type) data source. That means we could run prepared statements
over other prepared statements accepting the same parameters. This makes
it possible to create complex "low-level" BASE queries and dozens of
reporting queries that rely on them. Something like subquery structures:

SELECT A.Fld1, B.Fld2
FROM (SELECT Fld1 FROM Table1 WHERE xxx) A
     INNER JOIN Table2 B ON A.Fld1 = B.Fld2
WHERE yyy
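Purely as an illustration of the idea (this is NOT valid PostgreSQL syntax
today, and the names are made up), a "base" statement and a reporting
statement built on top of it might look like:

PREPARE base_orders (int) AS
  SELECT order_id, customer_id, total
  FROM orders
  WHERE customer_id = $1;

-- hypothetical: a prepared statement used as a row source
PREPARE big_orders (int) AS
  SELECT order_id, total
  FROM base_orders($1)
  WHERE total > 1000;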

> What would the interface to the feature be through database access
> drivers like JDBC? Explicit PREPARE GLOBAL or similar, then invocation
> with EXECUTE ?
>

They could remain the same. If possible, when EXECUTE <xxx> is called, PG
could look in the local and then the global prepared-statement pool (or the
reverse) and run the first statement it finds. This is greatly simplified;
it would require much more detailed planning if this option were to be
built into PG.
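Taking the PREPARE GLOBAL idea from above, purely hypothetical syntax just
to make it concrete (nothing like this exists in PG today):

-- prepared once, visible to every session until explicitly dropped
PREPARE GLOBAL get_orders (int) AS
  SELECT order_id, total FROM orders WHERE customer_id = $1;

-- any session, any connection, the usual command:
EXECUTE get_orders(42);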

> How would users using increasingly common layers like Hibernate/JPA use it?
>

I don't know, I've never used Hibernate/JPA. Somebody else could speak to
that. Making a PERSISTENT PREPARE statement available in PG doesn't mean
that everybody must use it.

> I'm also curious about how you'd address the possible need for periodic
> re-planning as the data changes, though AFAIK SQL functions suffer from
> the same limitation there.
>

Perhaps there could be several ways. It could be implemented within the
VACUUM procedure, which could re-plan all global prepared statements
according to the new circumstances. There should also be a manual
re-planning command for a specific statement, maybe even accepting sample
parameters so the optimization could be even more efficient.

> I guess I personally just don't understand what the point of the
> persistent prepare feature you describe is. However, this post that you
> linked to:
>
> http://archives.postgresql.org/pgsql-hackers/2008-04/msg00867.php
>
> actually describes a query plan cache, rather than persistent prepare.
> The post assumes the app will explicitly manage the cache, which I'm not
> sure is a good idea, but I can see the point of a plan cache. There
> might be some heuristics Pg could use to decide what to cache and to
> evict (planner time cost vs memory use, frequency of use, etc) so the
> app doesn't have to know or care about the plan cache. However, I'm not
> too sure how you'd match an incoming query to a cached plan, and
> determine that the plan was still valid, with enough speed to really
> benefit from the plan cache. Then again, I don't know much about Pg's
> innards, so that doesn't mean much.
>
> Tom Lane responded to that post to point out some of the complexities:
>
> http://archives.postgresql.org/pgsql-hackers/2008-04/msg00868.php
>
> --
> Craig Ringer
>

I think persistent prepare has something to do with global plan caching.
Preparing a statement actually prepares a plan for its execution,
giving us the possibility to execute it. With a persistent prepare
statement these plans would somehow have to go from local to global. I
agree they shouldn't be managed by the app itself; the app should only
issue requests to create, modify, execute and remove such objects.

This post is about a point of view. I know we can use stored procedures
and views and other good features of PG to manipulate all the data we need.

What I'm aiming at is this: it can be done in a simpler and more efficient
way with a PERSISTENT PREPARE implementation. Let me point out the key
benefits:

- a huge number of database developers could use this technique without
getting involved with CURSORS and CUSTOM DATA TYPE SETS.

This means a lot of hard-coded SQL commands built into the apps themselves
could easily be transferred to the database layer.

It also means thousands of Microsoft Jet based applications could migrate
to PG. Why? Because databases get larger as projects advance and
companies grow, and PG could offer an easier way to upsize than their
own Microsoft SQL engine does.

It also means a lot of new and less experienced database programmers
would turn their focus to PG. It's much easier to understand PREPARED
statements than SP constructions returning rowsets.

- EXEC(UTE) is a well-known SQL command understood and implemented by
almost every database engine in much the same way. It's also natively
supported by report-design tools and by ADO and ODBC drivers. It means
the front-end app doesn't even have to notice whether the statement is
locally or globally prepared.

This means that some modules of the system could create global
statements and other modules would simply use them, assuming that they
are in the DB as part of the "setup" process.

Lots of functionality is implemented in PG, and not all of it is used by
developers. My opinion is that an implementation of PERSISTENT or GLOBAL
PREPARED STATEMENTS would be of much use.
