When using the ps.setMaxRows() call on PreparedStatement, the jdbc
driver sets the row limit via the "raw" postgres protocol. However, in
the words of Tom Lane, "the row limit in the protocol only says how many
rows to deliver in the first batch. The presumption is that you'll
eventually grab the rest, and so the query is planned on that basis."
What this means that when we do the following query:
select action_id from actions order by action_id
with a ps.setMaxRows(100), it takes about 1.8 seconds for the query to
complete. However, if we do the following query:
select action_id from actions order by action_id limit 100
without any ps.setMaxRows() the query only takes 0.156 seconds. This is
more than a factor of 10 faster.
I'm not 100% sure of what JDBC says about setMaxRows (it's kind of
ambiguous in the java doc, as usual), but as far as I can tell, if you
call setMaxRows on the prepared statement there is no way in to ever
retrieve more than that number of rows. If this is indeed the case, it
seems to me that currently there is a mismatch between the JDBC api and
the postgresql api, and JDBC should somehow tell postgres that this is a
hard limit and it should not plan for a second batch.
Therefore, my question is: is this a bug? It is not feasable for me to
add LIMIT clauses to all the SQL queries in my code, so if this IS a
bug, I hope it can be fixed. If it is NOT a bug, is there an alternative
workaround that does not involve changing all of my sql statements?
Thanks in advance,
pgsql-jdbc by date
|Next:||From: Dave Cramer||Date: 2006-06-21 14:57:10|
|Subject: Re: Limit vs setMaxRows issue|
|Previous:||From: Heikki Linnakangas||Date: 2006-06-20 10:16:25|
|Subject: Re: statement caching proof of concept|