Re: [JDBC] Prepared statement performance...

From: Barry Lind <barry(at)xythos(dot)com>
To: Aaron Mulder <ammulder(at)alumni(dot)princeton(dot)edu>
Cc: Dave Cramer <Dave(at)micro-automation(dot)net>, Peter Kovacs <peter(dot)kovacs(at)sysdata(dot)siemens(dot)hu>, Dmitry Tkach <dmitry(at)openratings(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>, "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: [JDBC] Prepared statement performance...
Date: 2002-09-27 18:43:47
Message-ID: 3D94A6E3.2060100@xythos.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-jdbc

Aaron,

It is a bit more complex than just looking for a ; since a ; is a valid
character in a quoted string. But as Bruce has mentioned in a followup,
psql does it so it is doable.

I think the real question here is when does it make sense to use server
side prepared statements. In the little bit of testing I have done, I
would say the answer is rarely. You need many factors to come into
place for it to make sense to use server side prepared statements:

1) The statement needs to be big and complex such that there is
significant overhead in the parsing and planning stages of execution.
The query that I test with is about 3K in size and joins about 10
different tables and includes a union. In this case there is a
significant overhead involved in both parsing and planning. However
for a simple query that affects only one or two tables there is little
overhead.

2) The statement object needs to be reused multiple times. Using a
server prepared statement requires at a minimum three sql statements to
be executed to do the work of one original statement:
select foo from bar;
becomes
prepare <name> as select foo from bar;
execute <name>;
deallocate <name>;

We can do the first two together in one roundtrip to the server, but the
last one requires a separate roundtrip. So if you are only using a the
statement object/query only once then using server side prepared
statements will make performance worse than not. The vast majority of
statement objects are created, executed once and then closed. I don't
think it makes sence to turn on a feature that will make the overall
performance for most users worse. There is very little jdbc code that I
have seen that creates a statement, call execute multiple times on that
same statement object and then finally closes it.

This is the first version of the feature. Improvements will come with
implementation feedback and I welcome all feedback. But there are
reasons it is implemented the way it is. In the long term what I would
really like to see is this functionality done at the BE/FE protocol
level instead of at the sql level as I have seen other databases do.

Someone has suggestted a jdbc url arguement to enable the functionality
by default and that is probably a good idea if we learn from real use
that it makes sense to have all statements use this. But I am not
convinced (given my reasons above) that this really makes sense.

thanks,
--Barry

Aaron Mulder wrote:
> Dave, Barry, et al.:
>
>
>>Unless of course we were to write a
>>sql parser in the jdbc driver so that we could predetermine which
>>statements should use the prepare mechanism or not.
>
>
> That's a bit of an exaggeration -- all you need to check in the
> JDBC driver is whether there's an unescaped ; to see if there's more than
> one statement, right? It wouldn't be unreasonable to default to enabling
> the feature unless there's a ; in the statement (or unless there's a ;
> with anything but whitespace afterward, if you like) -- it wouldn't even
> be a disaster to forget escaping and just decide based on a ; in the
> statement period -- few enough statements have an escaped ; that the
> suboptimal performance in that case won't matter.
> I'm a little concerned about how this feature interacts with
> connection pooling as currently implemented -- it probably needs to be
> reset to a known state every time a connection is returned to the pool.
> Which is OK for the PostgreSQL implementation (which you'll note is not
> supposed to be used in an app server environment), but impossible for all
> the app servers out there which won't take driver-specific steps when
> connections are returned to their generic pools. Furthermore, with the
> default behavior set to "disabled", you're reducing performance (you
> quoted 60%?) in the application server environment, when there will never
> be multiple statements executed in the same call, but there will
> frequently be multiple identical statements executed on the same
> PreparedStatement, but there's no available way to reset the flag, short
> of writing a wrapper driver to stand between the app server and the real
> driver. On the other hand, if you set the default to a global "enabled",
> it breaks DBVisualizer and other development tools which currently allow
> multiple statements by default.
> IMHO, it would be better to have the feature enabled for queries
> where it's appropriate, and disabled where not, and let the programmer
> specify a global override where they absolutely must. That way
> DB-specific actions are only required in truly exceptional cases, and both
> the app servers and the tools behave properly by default.
> If you absolutely object, I think we should add a property to the
> Driver and the ConnectionPoolDataSource so that in an app server
> environment you can set the property to enabled at configuration time.
> But again, I would prefer smarter default behavior.
>
> Aaron
>
> On 27 Sep 2002, Dave Cramer wrote:
>
>>Peter,
>>
>>I have to agree with Barry here. Since it is possible to send two (or
>>more ) statements to the backend, the programmer needs some mechanism
>>for enabling/disabling this feature. Unless of course we were to write a
>>sql parser in the jdbc driver so that we could predetermine which
>>statements should use the prepare mechanism or not.
>>
>>I often use prepared statements instead of simple statements even when
>>only doing one select. The difference being that I can use setXXX to set
>>the parameters in the statment. Which is particularly important for
>>setDate, setTime, setTimestamp
>>
>>And you would be amazed at the unique (irregular) coding practices which
>>exist out there, so having the default do something sane is a GOOD
>>THING.
>>
>>Dave
>>
>>
>>On Fri, 2002-09-27 at 02:52, Peter Kovacs wrote:
>>
>>>Barry,
>>>
>>>It's great to have this functionality.
>>>
>>>However, I am surprised about the need/requirement to explicitely turning on
>>>this feature in the JDBC driver. As you explain, the reason for this
>>>behaviour is to support the use of multiple statements with the
>>>prepareStatement method. But this is supporting incorrect us of the JDBC
>>>interface, since the JDBC API says that the parameter of the
>>>prepareStatement method is one(!) SQL statetment. On the other hand, it is
>>>clear that the intended purpose of the prepareStatement method is to use
>>>prepared statment -- exclusively and not by default. For statements executed
>>>only once, the designers of the API made available the method
>>>"createStatment".
>>>
>>>So the buttom line is that you have an irregular behaviour in support of an
>>>irregular coding practice. I understand that there is "legacy" code that
>>>needs to use the prepareStatement method with multiple SQL statements. But
>>>since such practice was already incorrect at the time of its inception, the
>>>right way would be to allow a transition from bad to good by requiring the
>>>"sinners" to explicitly use some kind of
>>>"org.postgresql.PGStatement.setUseServerPrepare(false)" tweaking in their
>>>code, and let new coders use standard code.
>>>
>>>Of course, you're running the show, so you will do it as you deem right, but
>>>IMHO the implementation as you described it is clearly not reasonable.
>>>
>>>Cheers,
>>>
>>>Peter
>>>
>>>----- Original Message -----
>>>From: "Barry Lind" <barry(at)xythos(dot)com>
>>>To: "Dmitry Tkach" <dmitry(at)openratings(dot)com>
>>>Cc: <pgsql-general(at)postgresql(dot)org>; <pgsql-jdbc(at)postgresql(dot)org>
>>>Sent: Thursday, September 26, 2002 10:56 PM
>>>Subject: Re: [JDBC] [GENERAL] Prepared statement performance...
>>>
>>>
>>>
>>>>Dimitry,
>>>>
>>>>I have some good news for you. All of this functionality is in 7.3 of
>>>>the server and the current development build of the jdbc driver. Of
>>>>course 7.3 isn't quite production yet, but beta 2 should be out this week.
>>>>
>>>>Your first option is what has been implemented. In 7.3 the server now
>>>>supports 'prepare <name> as <sql>' and 'execute <name>' (and variants
>>>>for passing bind variables). The jdbc driver also now has a method on
>>>>org.postgresql.PGStatement.setUseServerPrepare(true) to enable the the
>>>>use of server side prepared statements (so you can cast your Statement
>>>>object to an org.postgresql.PGStatement object and enable the
>>>>functionality for a specific statement).
>>>>
>>>>This hasn't yet made its way into the jdbc documentation for 7.3, but
>>>>this will be documented before 7.3 goes production.
>>>>
>>>>This implementation does still require that you make some changes to
>>>>your app in order to enable it but that is by design for the following
>>>>reasons:
>>>> 1) In order to be useful you need to reuse a statement object
>>>>multiple times. Most code generally will just create a statement and
>>>>use it only once, and since more work is being done to prepare the
>>>>statement that would have a negative performance impact for the majority
>>>>of statements being executed in normal code.
>>>> 2) Prepare functionality will not work for all types of sql that you
>>>>can execute through the jdbc driver. For example you can quite easily
>>>>do the following: conn.prepareStatement("update foo set a = ?; insert
>>>>into bar values (?,?,?);");
>>>>By issuing both sql statements in one call only one network roundtrip is
>>>>needed to execute both statements providing what is likely a more
>>>>performant application. However this doesn't work with prepare because:
>>>>prepare foo as update foo set a = ?; insert into bar values(?,?,?); -
>>>>is one prepared statement and one regular statement which will cause
>>>>very strange things to happen.
>>>>
>>>>It is for the reasons above that the developer needs to explicitly turn
>>>>on the use of server side prepared statements when it makes sense to do
>>>
>>>so.
>>>
>>>>thanks,
>>>>--Barry
>>>>
>>>>PS. I would really appreciate it if you could test this new
>>>>functionality. As far as I know I am the only one who has used it. But
>>>>for one of my frequently used and complex sql statements the time to
>>>>execute dropped by 60% using server side prepared statements.
>>>>
>>>>
>>>>
>>>>Dmitry Tkach wrote:
>>>> > Hi, everybody.
>>>> >
>>>> > I am running into huge performance problems, due to JDBC not being able
>>>> > to cache query plans :-(
>>>> > My java program runs the (set of about 5) identical statements with
>>>> > different parameters for about 30 million times...
>>>> > What I am talking about below has to do with JDBC, but is not really
>>>> > specific to it - it seems to me, that the backend itself
>>>> > could be improved to better handle this kind of scenario, when the same
>>>> > statement is executed many times in the same session.
>>>> >
>>>> > It takes about 3 weeks(!) to run, and it looks like about *half* of
>>>
>>>that
>>>
>>>> > time is spent by the query planner,
>>>> > creating and recreating the query plans every time I ran the damn
>>>> > statement....
>>>> >
>>>> > I am looking into implementing some kind of a solution, that would let
>>>> > me work around that problem...
>>>> > So far, I only see two possibilities:
>>>> >
>>>> > - a general solution, that would involve extending postgres SQL gramma
>>>> > to include a 'prepare' statement
>>>> > - or am ugly work around, that would involve moving all my sql
>>>> > statements into stored procedures, and have
>>>> > those cache the query plans inside...
>>>> >
>>>> > The second solution is not only ugly (because it requires the
>>>> > application code to be changed and to have a specialized
>>>> > stored procedure for every query), but also requires some additional
>>>> > hacks (to overcome the hard limit on the number of
>>>> > function arguments and the inability for functions to return tuples) -
>>>> > the only way I imagine this can be made to work is
>>>> > to glue all the arguments together into a text string, and have the
>>>> > stored procedure parse it back, execute the query, then
>>>> > glue the resulting tuple(s) into another text string, return it, and
>>>> > have the application (or, perhaps, JDBC layer) parse it back
>>>> > into columns...
>>>> >
>>>> > I was wonderring if anybody has any better ideas how this can be made
>>>
>>>to
>>>
>>>> > work (I am looking for a solution that would
>>>> > minimize changes to the existing JDBC applications that use
>>>> > PreparedStatements)?
>>>> >
>>>> > If the maintainers of the involved code are interested, I would be
>>>> > willing to implement and contribute the solution we come up with ...
>>>> > (I figure, nobody would really be interested in getting that second
>>>> > solution I mentioned into the mainstream :-), but, if we are
>>>> > able to come up with something more general and less ugly, perhaps, I
>>>
>>>am
>>>
>>>> > not the only one who would be able to contribute from
>>>> > it)...
>>>> >
>>>> > For example, I believe, it should not be too complicated to implement
>>>> > that first possibility I described above...
>>>> > The way I see it would involve adding two statements to postgres SQL
>>>> > syntax:
>>>> >
>>>> > prepare <name> as <sql statement>
>>>> > and
>>>> > execute <name> with (<parameter list>)
>>>> >
>>>> > For example:
>>>> >
>>>> > prepare mystatement as select * from mytable where id = $1 and name
>>>
>>>like
>>>
>>>> > $2;
>>>> >
>>>> > and then
>>>> >
>>>> > execute mystatement with (1, 'Dima');
>>>> > execute mystatement with (2, 'John');
>>>> >
>>>> > etc....
>>>> >
>>>> > The JDBC driver would then send the 'prepare' command to the backend in
>>>> > Connection.prepareStatement (), and
>>>> > use the 'execute' in PreparedStatement.execute ();
>>>> >
>>>> > One potential problem with implementation I see here is that the query
>>>> > planner wants to know the argument types ahead of time...
>>>> > I guess, I could get around that by making all the arguments 'text',
>>>
>>>and
>>>
>>>> > having them casted into the right types when the
>>>> > statement is actually executed.
>>>> >
>>>> > There is, probably a need to also have some kind of a 'close' command
>>>
>>>to
>>>
>>>> > throw away the prepared query plans... or we could just
>>>> > make them last forever until, say, the end of transaction (or, perhaps,
>>>> > the end of the session?)...
>>>> >
>>>> > If there is anyone interested in discussing various possibilities, and
>>>> > getting this implemented one way or another,
>>>> > I would like to hear from you!
>>>> >
>>>> > Thanks!
>>>> >
>>>> > Dima.
>>>> >
>>>> >
>>>> >
>>>> > ---------------------------(end of
>>>
>>>broadcast)---------------------------
>>>
>>>> > TIP 2: you can get off all lists at once with the unregister command
>>>> > (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
>>>> >
>>>>
>>>>
>>>>
>>>>---------------------------(end of broadcast)---------------------------
>>>>TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
>>>>
>>>
>>>
>>>
>>>---------------------------(end of broadcast)---------------------------
>>>TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
>>>
>>>
>>
>>
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 5: Have you checked our extensive FAQ?
>>
>>http://www.postgresql.org/users-lounge/docs/faq.html
>>
>
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Magnus Naeslund(f) 2002-09-27 18:44:16 Re: How do i make use of listen/notify properly
Previous Message Barry Lind 2002-09-27 18:24:39 Re: [JDBC] Prepared statement performance...

Browse pgsql-jdbc by date

  From Date Subject
Next Message Barry Lind 2002-09-27 18:50:43 Re: [JDBC] Prepared statement performance...
Previous Message Barry Lind 2002-09-27 18:24:39 Re: [JDBC] Prepared statement performance...