Re: ResultSet.getBytes() of bytea speedup batch

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Mikko Tiihonen" <mikko(dot)tiihonen(at)iki(dot)fi>
Cc: <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: ResultSet.getBytes() of bytea speedup batch
Date: 2006-11-19 20:31:31
Message-ID: 4560BF23.9070003@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

Mikko Tiihonen wrote:
> The attached patch gives 20-400% speedup depending on the bytea column
> size. The speedup comes from switching to the binary transfer mode
> instead of the text transfer mode for bytea columns.

Impressive results!

> The patch also contains the test code which I used to test the
> performance (ByteaBenchTest.java). The benchmark results are:
>
> cvs:
> column size: transfer speed:
> 512b speed: 3,8MB/s
> 2048b speed: 10,0MB/s
> 4096b speed: 13,1MB/s
> 16384b speed: 22,0MB/s
> 65536b speed: 22,4MB/s
> 1048576b speed: 21,2MB/s
>
> patch:
> column size: transfer speed:
> 512b speed: 4,6MB/s - 21% speedup
> 2048b speed: 16,8MB/s - 68% speedup
> 4096b speed: 27,9MB/s - 112% speedup
> 16384b speed: 79,2MB/s - 260% speedup
> 65536b speed: 111,1MB/s - 395% speedup (optimal for my L2 cache?)
> 1048576b speed: 74,8MB/s - 252% speedup

I'm getting similar results, though the drop after 65536b is even steeper:

cvs:
[junit] 512b speed: 1.8MB/s memory: 2.5MB
[junit] 2048b speed: 5.0MB/s memory: 2.0MB
[junit] 4096b speed: 7.5MB/s memory: 2.0MB
[junit] 16384b speed: 11.8MB/s memory: 2.4MB
[junit] 65536b speed: 12.6MB/s memory: 2.4MB
[junit] 1048576b speed: 5.3MB/s memory: 5.8MB

patch:
[junit] 512b speed: 2.1MB/s memory: 2.0MB
[junit] 2048b speed: 7.4MB/s memory: 2.2MB
[junit] 4096b speed: 13.0MB/s memory: 2.1MB
[junit] 16384b speed: 35.9MB/s memory: 2.3MB
[junit] 65536b speed: 56.7MB/s memory: 2.0MB
[junit] 1048576b speed: 17.2MB/s memory: 2.9MB
[junit] ------------- ---------------- ---------------

Ran on Sun JRE 1.5.0_06, on Intel Core Duo. I sure would like to find an
explanation for the drop.

> Still open questions:
> - is there any better way to obtain the result set field information in
> QueryExecutor?
> * could the result set type information be obtained during the
> prepare?

It could. If the driver issued a Sync message after Describe, and waited
for the response before sending the Bind, the driver could use the
information in the RowDesription response. But that would require an
extra round-trip to the server, so it's probably not worth it in most
queries.

I wish we could set a default format for each data type in a separate
message. Like:

SetFormat(oid = 17, format="b")

The server would then send all bytea fields as binary unless otherwise
stated in the Bind message. It would require a backend-change, of course.

> - in which conditions should a prepared statement drop the cached fields
> * do the result set fields ever change?
> * any other conditions?
> - if caching of Field information is allowed (at least in some
> circumstances), could the driver stop requesting a Describe from the
> backend?

I can't imagine how the fields could change. Seems safe to me.

> @@ -126,6 +134,8 @@
> private byte[] encodedStatementName;
> private PhantomReference cleanupRef;
> private int[] preparedTypes;
> + private static final byte[] EMPTY_STATEMENTNAME = new byte[] { 0 };
> + private Field[] fields;
>
> final static SimpleParameterList NO_PARAMETERS = new SimpleParameterList(0);
> }

This is just some leftover crud, right? Couldn't find a reference to
EMPTY_STATEMENTNAME anywhere.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Browse pgsql-jdbc by date

  From Date Subject
Next Message Roland Walter 2006-11-19 21:42:07 Re: Use Driver to Create database?
Previous Message Mikko Tiihonen 2006-11-19 20:22:06 binary tuple receiving patch v2