Re: ps.setCharacterStream() and memory usage

From: Vadim Nasardinov <vadimn(at)redhat(dot)com>
To: Sebastiaan van Erk <sebster(at)sebster(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: ps.setCharacterStream() and memory usage
Date: 2004-11-02 14:46:09
Message-ID: 200411020946.09434@vadim.nasardinov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

On Friday 29 October 2004 07:56, Sebastiaan van Erk wrote:
> Coming back to another problem I have with the following insert
> statement: (Postgres 7.4, driver build 215, jdbc 3)
>
> ps.setCharacterStream(1, reader, (int) messageFile.length());
> ps.executeUpdate();
>
> The reason I do this (using a character stream to load a text file
> to a TEXT field) is that I wish to avoid loading the file into
> memory completely.
[...]
> [...] the file actually gets copied THREE times into memory, causing
> my Java file to already get a java.lang.OutOfMemory error with an
> insert of a 10M text file (Java allocates 20M for this, since it has
> 2 byte chars, and the driver makes (at least) 3 copies (referenced
> at the same time, not allowing one to be GC'ed), making 60M).

I assume you're talking about
org/postgresql/jdbc2/AbstractJdbc2Statement.java which has this piece
of code (slightly reformatted to fit on a 80-column page):

public void setCharacterStream(int i, Reader x, int length)
throws SQLException {

checkClosed();
if (length < 0)
throw new PSQLException(GT.tr("Invalid stream length {0}.",
new Integer(length)));

if (connection.haveMinimumCompatibleVersion("7.2")) {
char[] l_chars = new char[length];
int l_charsRead = 0;
try {
while(true) {
int n = x.read(l_chars,
l_charsRead,
length - l_charsRead);
if (n == -1)
break;
l_charsRead += n;
if (l_charsRead == length)
break;
}
} catch (IOException l_ioe) {
throw new PSQLException(GT.tr("Provided Reader failed."),
PSQLState.UNEXPECTED_ERROR, l_ioe);
}
setString(i, new String(l_chars, 0, l_charsRead));
} else {

}
}

This does appear to be optimized for reading smallish chunks of text.
If the character source behind the Reader is sufficiently large, I
believe we'd be better off building up a StringBuffer rather than a
character array. Something like this:

if (connection.haveMinimumCompatibleVersion("7.2")) {
StringBuffer sb = new StringBuffer(length);
BufferedReader br = new BufferedReader(x);
try {
while(true) {
String chunk = br.readLine();
if (chunk == null) { break; }
else {sb.append(chunk); }
}
} catch (IOException l_ioe) {
throw new PSQLException(GT.tr("Provided Reader failed."),
PSQLState.UNEXPECTED_ERROR, l_ioe);
}
setString(i, sb.toString());
}

The reason this is better is because

(a) the String(char[],int,int) constructor always defensively copies
the passed in character array. So, as you point out, if you
pass in a 10-million-character array, a new 10-million character
array will be allocated.

(b) in contrast, if you construct a String from a StringBuffer, they
can share the underlying character array:
http://www.google.com/search?q=Heinz+Kabutz+StringBuffer+Issue+068&btnI

In response to

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Sebastiaan van Erk 2004-11-02 15:08:49 Re: ps.setCharacterStream() and memory usage
Previous Message Kris Jurka 2004-11-02 01:12:52 Re: JDeveloper