Re: Interest in allowing caller to push binary data rather than having it pulled?

From: Álvaro Hernández Tortosa <aht(at)8kdata(dot)com>
To: Tom Dunstan <pgsql(at)tomd(dot)cc>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Interest in allowing caller to push binary data rather than having it pulled?
Date: 2017-03-27 21:46:07
Message-ID: c84cfb7e-668a-a26b-b5f3-97eea643db99@8kdata.com
Lists: pgsql-jdbc

On 23/03/17 04:24, Tom Dunstan wrote:
> Hi all
>
> I hit an interesting case today. It’s a bit of a limitation in the JDBC interface, so any support would have to be a proprietary interface.
>
> Basically I have one or more byte buffers that I’d like to stream into a BYTEA at the server (using a plain INSERT statement). In my case I’ve got Netty ByteBuf objects, but it could be anything.
>
> What are my current options? JDBC basically gives me PreparedStatement.setBytes() and PreparedStatement.setBinaryStream().
>
> PreparedStatement.setBytes() involves copying all the data, potentially multiple large buffers, into a large buffer of exactly the correct size. The reason to use ByteBufs in the first place was to pool our use of large buffers so that we don’t blow out our heap - this completely kills any hope of that.
>
> PreparedStatement.setBinaryStream() is more flexible, but under the hood we’re just pulling stuff into an intermediary 8k buffer and then writing it out to the socket. This is OK from a heap management perspective, but still has some unnecessary copying.
>
> What I’d really like to do would be to provide an object that the driver could interrogate for a length and then provide an OutputStream to write to. The interface would look something like:
>
> interface ByteStreamWriter {
>     int getLength();
>     void writeTo(OutputStream stream);
> }
>
> The provided output stream would be a very thin wrapper around the socket output stream just ensuring that we don’t write too many bytes out.
>
> Usage would look thusly:
>
> myPreparedStatement.setObject(n, new MyByteStreamWriter(myByteBuf), Types.VARBINARY);
>
> And the user could write whatever adapter they wanted around their data.
>
> There’s an existing StreamWrapper class in the codebase, but it just provides an InputStream when asked. It could be adjusted to use the above interface for consistency though.
>
> Thoughts? I’d be happy to code up a PR if there’s interest.
>
> Cheers
>
> Tom

Hi Tom.

I think this is quite a good approach. I've seen significant
overhead from heap object creation during
serialization/deserialization. Some of it is documented in the slides
of this talk:
https://www.slideshare.net/8kdata/java-and-postgresql-performance-features-and-the-future
(starting at slide #19).

So having an adapter to write the data to the socket without
rewriting the bytes is a very desirable goal, in my opinion.
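
To make the idea concrete, here is a minimal sketch in Java of what such
an adapter could look like. The ByteStreamWriter interface follows the
shape Tom proposed; the "throws IOException" clause, BoundedOutputStream
and ByteBuffersWriter are hypothetical names used only for illustration
and are not part of pgjdbc. Plain java.nio.ByteBuffer stands in for
Netty's ByteBuf so the sketch has no external dependencies.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

// Interface as proposed in the quoted message; "throws IOException" is an
// assumption added here so implementations can propagate stream errors.
interface ByteStreamWriter {
    int getLength();
    void writeTo(OutputStream stream) throws IOException;
}

// Hypothetical thin wrapper: forwards writes to the target stream but
// refuses to write more bytes than the declared length.
final class BoundedOutputStream extends OutputStream {
    private final OutputStream target;
    private long remaining;

    BoundedOutputStream(OutputStream target, long limit) {
        this.target = target;
        this.remaining = limit;
    }

    @Override
    public void write(int b) throws IOException {
        reserve(1);
        target.write(b);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        if (len < 0) {
            throw new IndexOutOfBoundsException("negative length: " + len);
        }
        reserve(len);
        target.write(buf, off, len);
    }

    @Override
    public void flush() throws IOException {
        target.flush();
    }

    // close() is deliberately not forwarded: the underlying stream is
    // expected to outlive a single parameter value.

    long remaining() {
        return remaining;
    }

    private void reserve(int n) throws IOException {
        if (n > remaining) {
            throw new IOException("attempt to write past the declared length");
        }
        remaining -= n;
    }
}

// Hypothetical adapter over one or more buffers; nothing is merged into a
// single large array before writing.
final class ByteBuffersWriter implements ByteStreamWriter {
    private final ByteBuffer[] buffers;

    ByteBuffersWriter(ByteBuffer... buffers) {
        this.buffers = buffers;
    }

    @Override
    public int getLength() {
        int total = 0;
        for (ByteBuffer b : buffers) {
            total += b.remaining();
        }
        return total;
    }

    @Override
    public void writeTo(OutputStream stream) throws IOException {
        for (ByteBuffer b : buffers) {
            ByteBuffer view = b.duplicate(); // leave the caller's position untouched
            if (view.hasArray()) {
                // Heap buffer: hand the backing array straight to the stream.
                stream.write(view.array(), view.arrayOffset() + view.position(), view.remaining());
            } else {
                // Direct or read-only buffer: go through a channel, which may
                // copy through a small temporary array internally.
                WritableByteChannel channel = Channels.newChannel(stream);
                while (view.hasRemaining()) {
                    channel.write(view);
                }
            }
        }
    }
}
```

In a driver, the target of BoundedOutputStream would be the socket output
stream, and the limit would be whatever getLength() returned when the
parameter length was sent; the wrapper only guards against a writer that
produces more bytes than it announced.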

Cheers,

Álvaro

--

Álvaro Hernández Tortosa

-----------
<8K>data
