Re: PreparedStatement parameters and mutable objects

From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Paul Thomas <paul(at)tmsl(dot)demon(dot)co(dot)uk>
Cc: "pgsql-jdbc (at) postgresql (dot) org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: PreparedStatement parameters and mutable objects
Date: 2004-01-13 02:02:41
Message-ID: 400351C1.7020609@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

Paul Thomas wrote:
> On 12/01/2004 00:39 Oliver Jowett wrote:
>> The problem is that it's very specific to the application workload;
>> what case do we measure?
>
> I think you would need something which closely models you app's work load.
>
>> The reason I'm interested in doing this for is not the direct CPU
>> overhead of object creation (none of the JDBC code is on our main
>> execution path), but the effect that object creation has on GC
>> interval and pause. We're running a low-latency app where an extra
>> 50ms pause due to GC has a large impact on our latency figures .. so
>> the less garbage we generate in the first place, the better. We could
>> farm the JDBC work out to a separate VM, but that gets complex quite
>> fast.
>
>
> Reducing the amount of garbage created is never a bad thing. Do you have
> an estimate of the % reduction you think you could achieve?

See the attached testcase, it does batched inserts of byte arrays. I've
run this as:

java -Xrunhprof:heap=all,file=insertgarbage.txt,depth=10,cutoff=0 \
-verbose:gc -cp .:/net/java/java_libs/postgresql.jar \
TestInsertGarbage \
'jdbc:postgresql:rhino?user=oliver&password=oliver' \
100000 100 100

which does 100k inserts each with a 100 byte array, in batches of 100.
This is roughly the same as our application workload when it is doing a
DB resync of a reasonably-sized provisioning database.

In the resulting heap dump, the allocation hotspots are:

QueryExecutor.sendQueryV2 -> Encoding.encode -> String.getBytes
(100000 byte[], 143738400 bytes)
(100000 byte[], 37144800 bytes)
(100000 byte[], 20000000 bytes) // This is encoding the query text
Statement.setBytes -> setString -> escapeString -> StringBuffer.append
(63000 char[], 81608000 bytes)
(19200 char[], 25018400 bytes)
(17800 char[], 23091200 bytes)
Statement.setBytes -> setString -> StringBuffer.ensureCapacity
(100000 char[], 65445600 bytes)
Statement.setBytes -> PGbytea.toPGString -> StringBuffer.<init>
(100000 char[], 41600000 bytes)
Statement.setBytes -> PGbytea.toPGString -> StringBuffer.append
(21800 char[], 17788800 bytes)
(21400 char[], 17462400 bytes)
(21300 char[], 17380800 bytes)
(20800 char[], 16972800 bytes)
(14400 char[], 11750400 bytes)
QueryExecutor.executeV2 -> Jdbc3PreparedStatement.createResultSet
(100000 Jdbc3ResultSet, 13600000 bytes)

All other allocation points allocate <10MB. There's quite a bit of other
object allocation that could be cleaned up, e.g. repeated allocation of
QueryExecutor objects.

Total allocations were around 631800000 bytes. I estimate that streaming
the parameter data would generate about 499000000 bytes less garbage
(basically all of the above goes away except for encoding the query text
and creating the resultsets), which is a 79% reduction (this is actually
a lot more than I expected!).

There are improvements we can make here without streaming, though. Not
creating resultset objects for updates, and doing all necessary escaping
in a single pass rather than two, both seem like low-hanging fruit.

-O

Attachment Content-Type Size
TestInsertGarbage.java text/x-java 2.0 KB

In response to

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Dave Cramer 2004-01-13 02:07:33 Re: PreparedStatement parameters and mutable objects
Previous Message Oliver Jowett 2004-01-12 22:44:33 Re: PreparedStatement parameters and mutable objects