Skip site navigation (1) Skip section navigation (2)

Prepared Statements and large where-id-in constant blocks?

From: James Robinson <jlrobins(at)socialserve(dot)com>
To: pgsql-jdbc(at)postgresql(dot)org
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Prepared Statements and large where-id-in constant blocks?
Date: 2004-04-19 15:56:06
Message-ID: 11531DB4-921A-11D8-8696-000A9566A412@socialserve.com (view raw or flat)
Thread:
Lists: pgsql-jdbcpgsql-sql
Howdy:

	Java middlewares like JBossCMP issue many queries in the general form 
of:

		SELECT t1.att1, t1.att2 ... t1.attN FROM t1 WHERE (t1.id = X) or 
(t1.id = Y) ....

where there may be anywhere between 1 and thousands of "(id = N)" 
blocks ORed together. These may be transformed to the "WHERE t1.id IN 
(X, Y, ...)" form for possibly a little performance gain (possibly -- 
I've not yet checked to see if this plans better than the other, but I 
could imagine this form being parsed into the hashjoin form as opposed 
to a huge index filter form).

	Real performance gains, however, could be attained through being able 
to ultimately use either the v2 PREPARE / EXECUTE statements or the 
better v3 protocol's bind / execute commands, but only if the SQL-level 
form of the query could better represent the fact there are not really 
N params, but, rather, one single param of type Set (or, heck, Array?). 
This would let all such queries map onto one single backend prepared 
statement, regardless of the size of the id set being passed in.

	I guess that separate preparation for each different cardinality would 
be okay performance-wise, but if there were some way to get all such 
queries factored-down into one single planned statement, it could:

1) Make far better use of JBoss's connection-wide LRU cache of 
PreparedStatements, since only one entry (with much higher cache hit 
rate) could exist for the entire family of queries.

2) Make better use of backend memory, since it only needed to prepare 
one such (generic) form, as opposed to one for each cardinality.


Problems in implementation:

	1) JBoss query engine would need to be educated about the potential to 
use this form as opposed to the "OR (t1.id=X)" form. Likewise, JBoss 
could / should well be educated about being able to use the "WHERE 
t1.id IN (X, Y, ...)" form for databases which support "WHERE .. IN ( 
.. )", probably an easier sell since this is most likely supported by 
more DBs than just PG.

	2) Does the JDBC spec allow any hooks for passing in a set of ids as 
one single param? We'd need the SQL-template to be prepared to look 
something like:

	SELECT t1.attr1 FROM t1 where t1.id in ( ? )

 From memory, perhaps setArray() might could be hacked for the job. I 
know JBossCMP uses
the setObject() call, so perhaps JDBC could be tricked out to handle a 
java.util.Collection, an arguably cleaner way to do it -- no backward 
compat issues since could be all-new functionality. JDBC driver could 
just iterate through the collection contents, calling setObject 
accordingly. Harder part would be educating JBoss to do this. Hardest 
part would be convincing someone to commit it into JBoss.

	3) Can a cardinality-free plan even be made? I bet I'm assuming a 
little too much in asserting all such plans are equal, but I suspect 
that Tom is going to tell me that the query for just one id value would 
and should be planned differently from the 800-value form, since the 
800-value form might well prefer a full sequential scan, since the 
table might only have 900 live rows in it.

Anyone have any insight or opinions?

[ crossposted to pgsql-sql for anyone's insight into the pure SQL / 
planning matters. Apologies in advance ].

----
James Robinson
Socialserve.com


Responses

pgsql-sql by date

Next:From: Gary StainburnDate: 2004-04-19 16:13:22
Subject: Re: three-way join - solved
Previous:From: CoLDate: 2004-04-19 15:47:07
Subject: Re: Concatenate results of a single column query

pgsql-jdbc by date

Next:From: Dave CramerDate: 2004-04-19 16:27:53
Subject: Re: Bad timestamp external representation ''
Previous:From: Katalina MarcosDate: 2004-04-19 15:44:41
Subject: RV: Bad timestamp external representation ''

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group