Re: bytea size limit?

From: Michael Privat <michael(at)ceci(dot)mit(dot)edu>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: Kris Jurka <books(at)ejurka(dot)com>, "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: bytea size limit?
Date: 2004-04-13 03:34:18
Message-ID: 1433211018.20040412233418@ceci.mit.edu
Lists: pgsql-jdbc

I don't think it will create more frequent GCs (the VM's GC is driven
by allocated size, not object count). Either way, it's a moot point if
we assume UTF-8. In this case, it is possible to forecast the size of
that byte array in advance and we don't need to mess around with
encoding just for sizing. Also the sendByteA() method can be optimized
to encode A LOT faster and more efficiently, directly at the byte
level instead of using the Encoder, pushing the data directly into the
stream.
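
To make the idea concrete, here is a rough sketch of what byte-level
escaping could look like. It is not the driver's actual sendByteA()
code; the class and method names are made up for illustration. It
assumes the usual bytea escape rule (a backslash is doubled, and any
byte outside printable ASCII becomes a backslash plus three octal
digits). Since every escaped character is plain ASCII, one character is
one byte under UTF-8, so the length is known before anything is written.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of the PostgreSQL JDBC driver.
final class ByteaEscapeSketch {
    private ByteaEscapeSketch() {}

    // Size of the escaped form in bytes. All output characters are ASCII,
    // so under the UTF-8 assumption this is also the encoded length.
    static int escapedLength(byte[] data) {
        int len = 0;
        for (byte b : data) {
            int v = b & 0xFF;
            if (v < 0x20 || v > 0x7E) {
                len += 4;          // \ooo
            } else if (v == '\\') {
                len += 2;          // \\
            } else {
                len += 1;          // printable ASCII passes through
            }
        }
        return len;
    }

    // Writes the escaped form straight to the stream, one source byte at a
    // time, without building intermediate Strings or char[] copies.
    static void writeEscaped(byte[] data, OutputStream out) throws IOException {
        for (byte b : data) {
            int v = b & 0xFF;
            if (v < 0x20 || v > 0x7E) {
                out.write('\\');
                out.write('0' + ((v >> 6) & 3));
                out.write('0' + ((v >> 3) & 7));
                out.write('0' + (v & 7));
            } else if (v == '\\') {
                out.write('\\');
                out.write('\\');
            } else {
                out.write(v);
            }
        }
    }

    // Convenience wrapper for trying the sketch out in memory.
    static byte[] escape(byte[] data) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream(escapedLength(data));
        try {
            writeEscaped(data, buf);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }
}
```

In a real driver the value from escapedLength() would go into the
message length field first, and writeEscaped() would then be pointed at
the (buffered) connection stream. Any extra quoting needed when the
value is embedded in a SQL string literal is deliberately left out.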

But anyway, I didn't mean to start a whole debate on this. Just
thought I'd share the code. The driver now handles much larger binary
objects than before, and we all agree that, with the UTF-8 assumption,
there are good optimizations to be made that had not been noticed
before.

Monday, April 12, 2004, 10:49:18 PM, you wrote:

OJ> Michael Privat wrote:
>> With regards to individually escaping, it's just as expensive as how
>> it was before, except that it doesn't make several copies of the array
>> like it used to (hence the memory saving). I don't think there is any
>> performance impact at all. Basically just a memory gain.

OJ> Doesn't it end up creating an individual String (and backing char[]) for
OJ> each source byte? While these objects are very short-lived, they are
OJ> extra garbage and will cause more frequent GCs.

>> As far as the encoding. I think in your original email you had
>> mentioned that the driver used UTF-8 (in which case there is an
>> obvious optimization that can be made), but I couldn't find it in the
>> driver. Everything looked like it was inheriting from the encoding
>> scheme set in the connection.

OJ> Everything does indeed use the encoding object on the connection. The
OJ> trick is that if you look at the V3 connection setup code, the encoding
OJ> is always set to UNICODE which maps to Java's UTF8.

OJ> V2 connections can have a variety of encodings. The driver sets the
OJ> encoding to UNICODE for server versions >= 7.3, but uses the database's
OJ> encoding for earlier versions. But we don't need to know the encoded
OJ> length in advance when talking V2 anyway..

OJ> -O

