Skip site navigation (1) Skip section navigation (2)

Re: Very strange performance decrease when reusing a PreparedStatement

From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Péter Kovács <maxottovonstirlitz(at)gmail(dot)com>
Cc: Dave Cramer <pg(at)fastcrypt(dot)com>, Frédérik Bilhaut <frederik(dot)bilhaut(at)noopsis(dot)fr>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Very strange performance decrease when reusing a PreparedStatement
Date: 2009-05-03 23:43:23
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-jdbc
Péter Kovács wrote:
> On Sun, May 3, 2009 at 12:10 PM, Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>>> It appears that the Postgres "server-prepared statement" cannot handle
>>> parameters to the statement. This is really unfortunate, because 99%
>>> of real-life applications will want to re-use the same statement
>>> (template) with different parameters.
>>> The term "server-prepared statement" itself already indicates that
>>> there may be something skewed about the "local" semantics of
>>> java.sql.PreparedStatements in the Postgres JDBC driver. There is no
>>> notion of "client-prepared statement" in the JDBC API, which conceives
>>> PreparedStatement instances as mere handles to server side objects.

There's no requirement in JDBC that PreparedStatement be merely a handle
to a server-side object - that's just one implementation option, and a
poor one to choose for the PostgreSQL wire protocol in its current form.
At a bare minimum, we cache much of the statement and parameter
information on the client side. (Can you imagine every setXXX call doing
a network round-trip, really?)

>>> And indeed, Postgres JDBC users have historically been using
>>> java.sql.PreparedStatements for its side-effect of preventing SQL
>>> injection rather than for the purpose the JDBC API designers had in
>>> mind with this class.
>> I'm not sure where this hypothesis is coming from. Postgresql server
>> prepared statements can certainly handle parameters.
>> What makes you think it can't ?
> "Server side prepared statements are planned only once by the server.
> This avoids the cost of replanning the query every time, but also
> means that the planner cannot take advantage of the particular
> parameter values used in a particular execution of the query. You
> should be cautious about enabling the use of server side prepared
> statements globally."

I don't think you're understanding that correctly. It has nothing to do
with how parameters are passed.

Are you familiar with how the wire protocol works?

If you have a query:

  SELECT * FROM foo WHERE someindexedvalue=?

When the unnamed statement is used, a new query plan is constructed when
parameters are bound to the statement before execution, and the
*particular* parameter value provided is given to the query planner.

The query planner then make decisions based on that actual value. For
example, if the table statistics say that the particular value used is
very selective for that column, it may decide to use an index scan over
a table scan.

When a named statement is used, the server behavior changes. The query
is planned once, when the statement is created but before parameter
values are bound. The planner does not have particular parameter values
available, and it creates a general plan suitable for any parameter
value. This can result in different index selectivity information,
because the values that will be actually be used are not known ahead of
time .. which in turn can result in different plans. The resulting plan
may be "good" for any random value but "bad" for a particular value.
Often apprications are not passing in any random value - there are
external constraints on the value that are not visible in the structure
of the query. For example, maybe you only retrieve log data from that
5GB table with a timestamp range that selects a few minutes of data -
but without specific parameter values, the planner doesn't know that.

"server side prepared statement" is the JDBC driver jargon for "uses a
persistent, named, statement at the protocol level". The mechanism for
passing parameters remains the same in both cases - the difference is
that in one case a one-shot unnamed statement is used, in the other case
a persistent server-side named statement is used.

The distinction is there precisely because there *ARE* known performance
issues on the server side with some queries when using a named statements.

There is not much more the driver can do than give you tuning knobs for
the problematic cases. Anything else is going to have to happen on the
server side to make the planner smarter in these cases (and I'm not even
sure if it CAN be smarter with the information it has)


In response to


pgsql-jdbc by date

Next:From: Oliver JowettDate: 2009-05-03 23:48:21
Subject: Re: Very strange performance decrease when reusing a PreparedStatement
Previous:From: Oliver JowettDate: 2009-05-03 23:28:23
Subject: Re: Very strange performance decrease when reusing a PreparedStatement

Privacy Policy | About PostgreSQL
Copyright © 1996-2018 The PostgreSQL Global Development Group