Re: Workarounds for getBinaryStream returning ByteArrayInputStream on bytea

From: Radosław Smogura <rsmogura(at)softperience(dot)eu>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Александър Шопов <lists(at)kambanaria(dot)org>, <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Workarounds for getBinaryStream returning ByteArrayInputStream on bytea
Date: 2010-11-26 16:20:47
Message-ID: 147c80a20962677cdf18c6c4a0232786@smogura-softworks.eu
Lists: pgsql-hackers pgsql-jdbc

On Fri, 26 Nov 2010 10:25:01 -0500 (EST), Kris Jurka <books(at)ejurka(dot)com>
wrote:
> On Fri, 26 Nov 2010, Radosław Smogura wrote:
>
>> I would like to send few files for getBinaryStream(). So this will work
>> much like stream and will don't eat so much heap. I don't copy source
>> this_row[i] array, so I don't know how this will do with concur updates,
>> (original method doesn't make this when column is not bytea, too). I left
>> few comments if we should throw exception on broken streams in 8.4, or
>> just silence notify EOF.
>
> The problem is that the whole bytea is still in this_row[i]. The value
> isn't being streamed from the server. So yes, you are saving a copy of
> the value which does save heap space, but that won't really help the
> described problem where many large bytea values are fetched because the
> driver will have read and stored them all prior to getBinaryStream being
> called.
>
> Kris Jurka

Yes, indeed, it won't give you a "big" heap saving. But in
getBinaryStream() the driver calls getBytes(), and then the PGBytea...
method. That method transforms the source, text-based array into a pure
binary array, so it creates a kind of copy of the source, generally a
smaller one (this copy will not be smaller than the source size divided
by 4). So when Aleksander compresses 1GB files (I assume he uses stream
compression), he allocates an additional 500-800MB or so on the heap for
this transformed array, but he doesn't need that much at one time, as a
compression block isn't larger than 1MB.

That is why the submitted streams perform the conversion "on-line".
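
To illustrate the idea of "on-line" conversion, here is a minimal sketch of a stream that decodes the text-based source array one byte at a time instead of building a second, fully decoded array. It is not the submitted patch: the class name EscapedByteaInputStream is hypothetical, and it assumes the pre-9.0 escape text format, where a backslash is followed either by another backslash or by three octal digits. It throws on a broken escape sequence, which is only one of the two behaviours discussed above (the other being to silently report EOF).

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: decodes escape-format bytea text lazily, byte by byte.
class EscapedByteaInputStream extends InputStream {
    private final byte[] src; // text-based source array, not copied
    private int pos;          // next unread position in src

    EscapedByteaInputStream(byte[] src) {
        this.src = src;
    }

    @Override
    public int read() throws IOException {
        if (pos >= src.length) {
            return -1; // end of the source array
        }
        int b = src[pos++] & 0xFF;
        if (b != '\\') {
            return b; // ordinary byte, stored literally
        }
        // "\\" encodes a single backslash
        if (pos < src.length && src[pos] == '\\') {
            pos++;
            return '\\';
        }
        // otherwise expect three octal digits: \ooo
        if (pos + 3 > src.length) {
            throw new IOException("Truncated escape sequence in bytea value");
        }
        int value = 0;
        for (int i = 0; i < 3; i++) {
            int digit = src[pos++] - '0';
            if (digit < 0 || digit > 7) {
                throw new IOException("Invalid octal digit in bytea value");
            }
            value = (value << 3) | digit;
        }
        if (value > 255) {
            throw new IOException("Escape sequence out of byte range");
        }
        return value;
    }
}
```

With such a stream the only extra memory needed is whatever buffer the consumer (for example a compressor) uses, while the source array itself still stays on the heap, as Kris points out.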
--
----------
Radosław Smogura
http://www.softperience.eu
