Re: Inserting large BLOBs via JDBC - OutOfMemoryError

From: hhaag(at)gmx(dot)de
To: Barry Lind <barry(at)xythos(dot)com>
Cc: hhaag(at)gmx(dot)de, pgsql-jdbc(at)postgresql(dot)org, pgsql-jdbc-owner(at)postgresql(dot)org
Subject: Re: Inserting large BLOBs via JDBC - OutOfMemoryError
Date: 2002-08-16 08:01:04
Message-ID: 24814.1029484864@www9.gmx.net
Lists: pgsql-jdbc

>While "new StringBuffer(p_buf.length)" is probably an improvement, it is
>difficult to predict what size buffer you will really need. This is
>because depending on the data you will see between zero and four times
>data expansion. Because the protocol postgres uses to talk between the
>client and server is string based, the binary data needs to be encoded
>in an ascii safe way. The encoding for the bytea datatype is to use
>\OOO octal escaping. Therefore each byte of data may take up to four
>bytes in the output. However if the data is mostly printable 7bit ascii
>bytes then there will be little expansion.
>
>I think your idea of initializing the buffer to be the size of the
>byte[] is a good idea. I will apply that change unless someone has a
>better suggestion.

I think it is at least better than initializing the StringBuffer with the
default capacity, which is 16 characters. And as long as the StringBuffer is
used only internally (as a local variable) in a private method, no other parts
of the code should be affected. Of course, the final size of the created
string still cannot be predicted.
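
To make the sizing trade-off concrete, here is a minimal sketch of octal
escaping into a pre-sized buffer. The class name, the method name and the
escaping rule (printable 7-bit ASCII except backslash is copied, everything
else becomes \OOO) are simplifying assumptions for illustration, not the
driver's actual code.

```java
public class ByteaEscapeSketch {
    // Simplified rule (an assumption, not the driver's exact code):
    // printable 7-bit ASCII other than backslash is copied unchanged,
    // every other byte becomes a backslash plus three octal digits.
    public static String escape(byte[] buf) {
        // Start at the input length: exact for all-printable data,
        // a lower bound (the buffer grows as needed) otherwise.
        StringBuffer sb = new StringBuffer(buf.length);
        for (int i = 0; i < buf.length; i++) {
            int b = buf[i] & 0xFF; // treat the byte as unsigned
            if (b >= 32 && b <= 126 && b != '\\') {
                sb.append((char) b);
            } else {
                sb.append('\\');
                sb.append((char) ('0' + ((b >> 6) & 3)));
                sb.append((char) ('0' + ((b >> 3) & 7)));
                sb.append((char) ('0' + (b & 7)));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] text = "hello".getBytes();
        byte[] binary = { 0, (byte) 255, 92 };
        System.out.println(escape(text));   // hello
        System.out.println(escape(binary)); // \000\377\134
    }
}
```

With mostly printable input the initial capacity is already sufficient; with
arbitrary binary data the buffer still has to grow, but it starts far closer
to the final size than the default of 16.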

There are also other places where StringBuffer usage could be improved in my
opinion:

(1) org.postgresql.jdbc1.AbstractJdbc1Statement#setString()

// Some performance caches
private StringBuffer sbuf = new StringBuffer();
...

current:

public void setString(int parameterIndex, String x) throws SQLException {
....
synchronized (sbuf) {
sbuf.setLength(0);

proposed:

StringBuffer sbuf = new StringBuffer(x.length());

--> Use a local, non-synchronized variable and initialize the StringBuffer
with a sensible capacity.

Please note that I have not fully explored the use of synchronized and the
reuse of the StringBuffer. But as the synchronized keyword indicates, this
variable is only accessed by one thread at a time. In addition, the contents
of the StringBuffer are always discarded at the beginning of the method. So a
local variable should be fine, and faster than a synchronized instance
variable.
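
A minimal sketch of the local-buffer variant, for illustration only. The body
of setString() is elided above, so the quoting rule used here (wrap in single
quotes, backslash-escape quotes and backslashes) and the names
LocalBufferSketch and quote() are assumptions, not the driver's code.

```java
public class LocalBufferSketch {
    // Hypothetical stand-in for the string quoting done in setString().
    public static String quote(String x) {
        // Local buffer: no state is shared between calls, so no
        // synchronized block and no setLength(0) reset are needed.
        // The capacity covers the value plus the two surrounding quotes;
        // escaped characters may still force the buffer to grow.
        StringBuffer sbuf = new StringBuffer(x.length() + 2);
        sbuf.append('\'');
        for (int i = 0; i < x.length(); i++) {
            char c = x.charAt(i);
            if (c == '\'' || c == '\\') {
                sbuf.append('\\'); // escape quote and backslash
            }
            sbuf.append(c);
        }
        sbuf.append('\'');
        return sbuf.toString();
    }

    public static void main(String[] args) {
        System.out.println(quote("abc"));      // 'abc'
        System.out.println(quote("O'Reilly")); // 'O\'Reilly'
    }
}
```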

(2) org.postgresql.jdbc1.AbstractJdbc1Statement#compileQuery()

protected synchronized String compileQuery()
throws SQLException
{
sbuf.setLength(0);
int i;

if (isFunction && !returnTypeSet)
throw new PSQLException("postgresql.call.noreturntype");
if (isFunction) { // set entry 1 to dummy entry..
inStrings[0] = ""; // dummy entry which ensured that no one overrode
// and calls to setXXX (2,..) really went to first arg in a function call..
}

for (i = 0 ; i < inStrings.length ; ++i)
{
if (inStrings[i] == null)
throw new PSQLException("postgresql.prep.param", new Integer(i + 1));
sbuf.append (templateStrings[i]).append (inStrings[i]);
}
sbuf.append(templateStrings[inStrings.length]);
return sbuf.toString();
}

In this case too, the StringBuffer should be initialized with a sensible
capacity, something like the sum of the lengths of all strings to be appended.
I'm in a bit of a rush today, but I'll try to come up with an algorithm in the
next few days.
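
One possible way to compute that capacity, as a rough sketch rather than a
finished patch: sum the lengths in a first pass, then append in a second. The
class CompileQuerySketch and the static compile() method are hypothetical; only
the array names templateStrings and inStrings are taken from the method above,
and a plain IllegalStateException stands in for PSQLException.

```java
public class CompileQuerySketch {
    // Joins templateStrings[0], inStrings[0], templateStrings[1], ... into
    // one string, sizing the buffer exactly before appending anything.
    public static String compile(String[] templateStrings, String[] inStrings) {
        if (templateStrings.length != inStrings.length + 1) {
            throw new IllegalArgumentException("need one more template than parameters");
        }
        // First pass: add up the lengths of everything that will be appended.
        int capacity = templateStrings[inStrings.length].length();
        for (int i = 0; i < inStrings.length; i++) {
            if (inStrings[i] == null) {
                throw new IllegalStateException("parameter " + (i + 1) + " not set");
            }
            capacity += templateStrings[i].length() + inStrings[i].length();
        }
        // Second pass: append into a buffer that never needs to grow.
        StringBuffer sbuf = new StringBuffer(capacity);
        for (int i = 0; i < inStrings.length; i++) {
            sbuf.append(templateStrings[i]).append(inStrings[i]);
        }
        sbuf.append(templateStrings[inStrings.length]);
        return sbuf.toString();
    }

    public static void main(String[] args) {
        String[] templates = { "SELECT * FROM t WHERE a = ", " AND b = ", "" };
        String[] params = { "1", "'x'" };
        System.out.println(compile(templates, params));
        // SELECT * FROM t WHERE a = 1 AND b = 'x'
    }
}
```

The extra pass only reads string lengths, so it should be cheap compared with
the repeated copying that happens when an undersized buffer has to grow.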
