Re: OutOfMemory when inserting stream of unknown length

From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: "Mikko T(dot)" <mtiihone(at)cc(dot)hut(dot)fi>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: OutOfMemory when inserting stream of unknown length
Date: 2004-08-19 21:27:40
Message-ID: 41251B4C.9020205@opencloud.com
Lists: pgsql-jdbc

Mikko T. wrote:

> The javadoc itself (reproduced at the bottom of the mail) is a bit
> controversial as it at the same time says that the stream will have
> 'length' number of bytes, but on the other hand says that all data will
> be read. I have interpreted this so that the length is just a hint and
> jdbc driver must not store more bytes than the length, but an
> end-of-file before the length bytes is still valid and shouldn't cause
> any exceptions. And the javadoc might
> also mean that if the stream hasn't reached end-of-file when length
> bytes have been read the jdbc driver should still continue reading but
> discarding the remaining bytes.

This is not my interpretation (and not the driver's interpretation
either). The driver reads exactly 'length' bytes and does not
subsequently touch the stream. It is an error to provide fewer than
'length' bytes in the stream (and this will actually also toast your
server connection).

Your OOME is due to older drivers not having proper streaming logic --
they read into heap then call setBytes(). Newer drivers stream the data
directly to the server.

> The PreparedStatement.setBinaryStream(int parameterIndex,
> InputStream x,
> int length)
>
> Sets the designated parameter to the given input stream, which will have
> the specified number of bytes.

This sounds like a requirement to me -- i.e. it is an error to pass a
stream that does not have the specified number of bytes.

> .. The data will be read from the stream
> as needed until end-of-file is reached.

.. but as usual the JDBC javadoc goes on to contradict itself. Sigh. I
wish Sun could come up with a proper spec for a change, not just a
collection of partially-documented APIs.

As I see it the reason for having a length value there is so that the
driver can stream the data directly to the DB even when it needs to know
the length ahead of time. This is exactly the case with the PostgreSQL
driver. If we can't trust the length field to be accurate, then we must
read the entire stream into heap before starting. In that case
setBinaryStream() is no better than setBytes().
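
To make the point concrete, here is a minimal sketch of why a trusted
length lets a driver stream with a small fixed buffer. It is not the
driver's actual code: the class and method names are made up for
illustration, and the framing is simplified to a 4-byte length followed
by the data. Once the length has been written, the sender has to
deliver exactly that many bytes, so a short stream can only be reported
as an error.

```java
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, for illustration only (not part of any driver).
class LengthPrefixedCopy {
    // Writes a 4-byte big-endian length, then exactly 'length' bytes
    // copied from 'in'. Never reads past 'length' bytes.
    static void writeParameter(InputStream in, int length, OutputStream out)
            throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(length);          // the length goes out first

        byte[] buf = new byte[8192];    // fixed buffer, independent of 'length'
        int remaining = length;
        while (remaining > 0) {
            int n = in.read(buf, 0, Math.min(buf.length, remaining));
            if (n < 0) {
                // The length is already on the wire; we cannot shrink it now.
                throw new EOFException("stream ended " + remaining
                        + " bytes short of declared length " + length);
            }
            data.write(buf, 0, n);
            remaining -= n;
        }
        data.flush();
    }
}
```

Heap use here is bounded by the buffer size rather than by the size of
the data, which is the whole advantage over reading everything into a
byte[] first.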

I could live with an interpretation that says "always store exactly
length bytes, but then read to EOF if there are extra bytes left over".
It would still be an error to supply fewer than 'length' bytes in the
stream; I think this is better than padding with 0 bytes or anything
similar (by the time the driver sees EOF, it is committed at the
protocol level to writing a parameter of length 'length', so it can't
just stop at EOF).
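
As a rough sketch of that alternative (again not driver code; the names
are hypothetical), the copy loop would store exactly 'length' bytes,
fail on a short stream, and then read and discard whatever is left:

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, for illustration only (not part of any driver).
class ExactThenDrain {
    // Copies exactly 'length' bytes from 'in' to 'out', then reads 'in'
    // to end-of-file and discards the rest.
    // Returns the number of discarded bytes.
    static long copyExactlyThenDrain(InputStream in, int length, OutputStream out)
            throws IOException {
        byte[] buf = new byte[8192];
        int remaining = length;
        while (remaining > 0) {
            int n = in.read(buf, 0, Math.min(buf.length, remaining));
            if (n < 0) {
                // Still an error: no padding with 0 bytes.
                throw new EOFException("stream ended " + remaining
                        + " bytes short of declared length " + length);
            }
            out.write(buf, 0, n);
            remaining -= n;
        }

        long discarded = 0;
        int n;
        while ((n = in.read(buf)) >= 0) {
            discarded += n;             // extra bytes are read but not stored
        }
        return discarded;
    }
}
```

Note that the drain loop blocks until the stream reports EOF, which may
never happen for something like a socket stream.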

But I don't know if that's a better interpretation.

-O
