Re: Reading and writing off-heap data

From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Tom Dunstan <pgsql(at)tomd(dot)cc>
Cc: "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Reading and writing off-heap data
Date: 2017-09-21 15:31:23
Message-ID: CADK3HH+ZptDm7LSizBrtWSZjjKcC_frx7Z=RfcYQOpP23noVAg@mail.gmail.com
Lists: pgsql-jdbc

On 20 September 2017 at 22:43, Tom Dunstan <pgsql(at)tomd(dot)cc> wrote:

> Hi all
>
> After the original discussion back in March [1] we've finally gotten around to
> scheduling this work. I've submitted a pull-request to support writing data
> from off-heap locations using the discussed interface here [2].
>
> We'd like to do the flip of this too, though: read incoming data into a
> caller-controlled buffer in some way. The interface would look something
> like this:
>
> // provided by driver
> interface ByteStreamReader<T extends Closeable> {
>     T readByteStream(int length, InputStream stream) throws IOException;
> }
>
> // user code
> class MyCustomByteStreamReader implements ByteStreamReader<MyBufferHandle> {
>     ...
> }
>
> preparedStatement.registerByteStreamReader(new MyCustomByteStreamReader());
> ...
> MyBufferHandle b = (MyBufferHandle) resultSet.getObject(2);
>
>
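
For illustration, a user-side reader of the kind proposed above might copy
the incoming stream into a direct (off-heap) ByteBuffer. This is only a
sketch under assumptions: ByteStreamReader is the interface proposed in this
thread, not an existing PgJDBC API, and MyBufferHandle and its buffer field
are hypothetical names.

```java
import java.io.Closeable;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Interface as proposed in this thread (assumption: not part of any released driver).
interface ByteStreamReader<T extends Closeable> {
    T readByteStream(int length, InputStream stream) throws IOException;
}

// Hypothetical handle that owns an off-heap buffer.
final class MyBufferHandle implements Closeable {
    final ByteBuffer buffer;

    MyBufferHandle(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    @Override
    public void close() {
        // A real handle would release the buffer or return it to a pool here.
    }
}

final class MyCustomByteStreamReader implements ByteStreamReader<MyBufferHandle> {
    @Override
    public MyBufferHandle readByteStream(int length, InputStream stream) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(length);
        // Copy through a small heap chunk; never read more than `length` bytes,
        // because the rest of the stream belongs to the driver.
        byte[] chunk = new byte[Math.min(length, 8192)];
        int remaining = length;
        while (remaining > 0) {
            int n = stream.read(chunk, 0, Math.min(chunk.length, remaining));
            if (n < 0) {
                throw new EOFException("stream ended with " + remaining + " bytes unread");
            }
            buf.put(chunk, 0, n);
            remaining -= n;
        }
        buf.flip(); // make the copied bytes readable from position 0
        return new MyBufferHandle(buf);
    }
}
```

The reader deliberately leaves the stream open, on the assumption that the
driver keeps using the underlying connection after the value has been read.
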
> There are a couple of issues:
>
> 1. Internally, the driver passes incoming tuples around as byte[][]
> instances, which doesn't leave much ability to do something else with the
> incoming data. I've submitted a PR [3] that introduces a Tuple class as a
> wrapper to pass around, which then allows us to do more interesting things
> with the data.
>
> 2. How should we register the reader? We have to do it ahead of execution
> of the query, as the driver has already read at least some data rows by the
> time we return the ResultSet.
> Some potential options are:
>
> a) Register against the statement and use it for all columns of binary
> type. This would look like the above.
>
> b) Register against the statement but for individual columns:
> statement.registerByteStreamReader(2, new MyCustomByteStreamReader());
> statement.registerByteStreamReader("foo", new MyCustomByteStreamReader());
>
> c) Mark incoming columns in some other way that the driver can recognise.
> This requires getting creative. An example would be to create a domain over
> the bytea type and then register the reader for that type. Then queries
> would have to have results cast to that type.
>
I definitely don't like option c).

> d) Register a higher-level object like the connection or driver and use it
> for all columns of binary type.
>
> Option a) is the simplest in that it neither requires us to keep track of
> readers for individual columns nor requires users to mess with their
> database schema. It's definitely enough for my use-case, but I'm interested
> in hearing other opinions on whether that's flexible enough.
>
I think I prefer this, as it is the simplest.
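
To make the difference in bookkeeping between options a) and b) concrete,
here is a rough sketch of what a statement would have to track. All names
are hypothetical and nothing here is taken from the driver: option a) needs
only the single default field, while option b) adds the two per-column maps
and the lookup order between them.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-statement registry; R stands for the reader type.
final class ReaderRegistrySketch<R> {
    private R defaultReader;                                  // option a)
    private final Map<Integer, R> byIndex = new HashMap<>();  // option b), 1-based index
    private final Map<String, R> byName = new HashMap<>();    // option b), column label

    void register(R reader) {
        defaultReader = reader;
    }

    void register(int columnIndex, R reader) {
        byIndex.put(columnIndex, reader);
    }

    void register(String columnName, R reader) {
        byName.put(columnName, reader);
    }

    // Per-column registrations win over the statement-wide default.
    R lookup(int columnIndex, String columnName) {
        R reader = byIndex.get(columnIndex);
        if (reader == null) {
            reader = byName.get(columnName);
        }
        return reader != null ? reader : defaultReader;
    }
}
```

With only option a) in play, lookup collapses to returning the default
reader for every binary column.
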

>
> Is there general support for the feature? I'm again happy to code up a PR
> and have time allocated to do that fairly soon if there's likelihood of it
> being merged.
>
I would think so; however, as noted on the PR, I'd like to see some timing
of the Tuple code. I don't expect it would be heinous, but one never knows.
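
For reference, the kind of wrapper being discussed can be very thin. The
following is a minimal sketch, not the class from the PR: the name
TupleSketch and its methods are assumptions made for illustration. The cost
over a bare byte[][] is one extra object per row and one extra indirection
per field access, which is the overhead any timing run would need to
measure.

```java
// Hypothetical thin wrapper around the byte[][] the driver passes around today.
final class TupleSketch {
    private final byte[][] fields;

    TupleSketch(byte[][] fields) {
        this.fields = fields;
    }

    int fieldCount() {
        return fields.length;
    }

    // A null entry represents SQL NULL for that column (0-based index).
    boolean isNull(int index) {
        return fields[index] == null;
    }

    byte[] get(int index) {
        return fields[index];
    }

    // Total payload size in bytes, skipping NULL fields.
    int length() {
        int total = 0;
        for (byte[] field : fields) {
            if (field != null) {
                total += field.length;
            }
        }
        return total;
    }
}
```
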

Dave Cramer

davec(at)postgresintl(dot)com
www.postgresintl.com
