Re: Streaming bytea implementation offered

From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: Werner Donné <werner(dot)donne(at)pincette(dot)biz>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Streaming bytea implementation offered
Date: 2012-09-01 02:11:44
Message-ID: 50416EE0.7070408@archidevsys.co.nz
Lists: pgsql-jdbc

On 01/09/12 13:04, Craig Ringer wrote:
> On 09/01/2012 12:49 AM, Werner Donné wrote:
>> Hello,
>>
>> I have modified version 9.1-902 of the driver in order for it to
>> stream bytea values when calling ResultSet.getBinaryStream(). The
>> implementation is here:
>>
>> http://www.pincette.biz/home/werner/postgresql-jdbc-9.1-902-patched.src.zip
>>
>>
>> When a bytea value is larger than 1MB it is streamed to a temporary
>> file instead of being copied in the "answer" array. When the stream
>> is later accessed, either with "getBinaryInputStream" or "getBytes",
>> the bytes are fetched from the temporary file, which is subsequently
>> deleted. The "getBytes" method copies the bytes in the result array,
>> while "getBinaryInputStream" returns a FileInputStream with an
>> overridden "close" method that deletes the temporary file.
>
> Sounds potentially good, but what happens if it's run inside the
> SecurityManager, or in an environment with no ability to create
> temporary files?
>
> What about out-of-disk causing failures that wouldn't occur with OOM?
>
> Do you propose to make this always-on, or configurable?
>
> How (if at all) does this interact with JDBC-standard BLOB support,
> and with large objects?
>
>
> If you'd like this change considered, it'd be great if you could
> create a patch against the current PgJDBC sources, rather than just
> publishing patched sources. I'd recommend that you clone PgJDBC
> (https://github.com/pgjdbc/pgjdbc):
>
> git://github.com/pgjdbc/pgjdbc.git
>
> then copy your changes into that working tree and use `git diff`.
> Alternatively, you can `git add` them, `git commit`, then use `git
> format-patch` to get a diff; see
>
> http://www.kernel.org/pub/software/scm/git/docs/git-format-patch.html
>
> Make sure not to include any generated files, and please add some unit
> tests demonstrating the functionality if possible.
>
> --
> Craig Ringer
>
>
Hi,

BACKGROUND: I am working on a project at the University of Auckland
that will involve serving images over the web from JBoss backed by
PostgreSQL (probably 9.2, though 9.1 is sufficient) running on a Linux
box. We are talking transactions per minute rather than per second, but
the SQL queries are likely to be quite complicated and involve many
images -- though detailed database design is yet to be done. We will
probably have a reasonable amount of RAM.

So in the future, I may have need of the functionality discussed.

At this stage, I would prefer to avoid work files as much as possible,
so I would appreciate it if the _threshold for deciding when to use a
temporary file were configurable_. Currently the test image files are
under 1 MB, but I expect larger images will need to be dealt with.
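
To illustrate the kind of knob I mean, here is a minimal sketch in Java.
It is not taken from Werner's patch or from PgJDBC: the class name
SpoolingBuffer, the thresholdBytes parameter and the buffer() method are
all hypothetical, and I am assuming the value's length is known before
it is read.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper; not part of PgJDBC or of the patch under discussion. */
final class SpoolingBuffer {
    private final int thresholdBytes; // assumed to come from a driver setting

    SpoolingBuffer(int thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    /**
     * Reads exactly 'length' bytes from 'source'. Values at or below the
     * threshold stay in memory; larger ones are spooled to a temporary
     * file that is deleted when the returned stream is closed.
     */
    InputStream buffer(InputStream source, int length) throws IOException {
        if (length <= thresholdBytes) {
            byte[] data = new byte[length];
            int off = 0;
            while (off < length) {
                int n = source.read(data, off, length - off);
                if (n < 0) {
                    throw new EOFException("value ended after " + off + " bytes");
                }
                off += n;
            }
            return new ByteArrayInputStream(data);
        }

        final Path tmp = Files.createTempFile("bytea", ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            byte[] chunk = new byte[8192];
            int remaining = length;
            while (remaining > 0) {
                int n = source.read(chunk, 0, Math.min(chunk.length, remaining));
                if (n < 0) {
                    throw new EOFException("value ended with " + remaining + " bytes missing");
                }
                out.write(chunk, 0, n);
                remaining -= n;
            }
        } catch (IOException e) {
            Files.deleteIfExists(tmp); // do not leave a partial file behind
            throw e;
        }

        // Close the file first, then remove it; safe if close() is called twice.
        return new FileInputStream(tmp.toFile()) {
            @Override
            public void close() throws IOException {
                try {
                    super.close();
                } finally {
                    Files.deleteIfExists(tmp);
                }
            }
        };
    }
}
```

With something like this, constructing it as new SpoolingBuffer(Integer.MAX_VALUE)
would keep every value in memory, which is what I would start with, and
the threshold could be lowered later if the images grow.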

Cheers,
Gavin
