Re: BLOBs

From: "Joe Shevland" <jshevland(at)j-elite(dot)com>
To: "Marin Dimitrov" <marin(dot)dimitrov(at)sirma(dot)bg>, <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: BLOBs
Date: 2002-03-21 04:04:21
Message-ID: HEECIHEEJDBMCCGMGIOBEEOKCFAA.jshevland@j-elite.com
Lists: pgsql-jdbc

> what is the recommended way to work with LOBs in Postgres 7.2?
>
> there are certain performance-related issues with BYTEA if it is read as a
> whole and not by chunks

From what I can see in the code, the bytea field data for the entire result set is sucked into memory when the query is executed, which is a potential performance hit, whereas with LOBs (using an oid as the field type) the input stream is opened only when the column is referenced. I hope I'm right in saying that.
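
Just to illustrate the difference, here's a rough sketch of the two read paths. The images_bytea/images_lo tables and column names are made up for the example, they're not anything from the driver:

import java.io.InputStream;
import java.sql.*;

public class ReadComparison {
    public static void main(String[] args) throws Exception {
        Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "user", "pw");
        Statement st = con.createStatement();

        // bytea: the value arrives with the row, so it's already in memory
        ResultSet rs = st.executeQuery("SELECT img FROM images_bytea WHERE id = 1");
        if (rs.next()) {
            byte[] img = rs.getBytes("img");   // whole value materialised here
            System.out.println("read " + img.length + " bytes");
        }
        rs.close();

        // oid/LOB: the stream is only opened when the column is referenced
        con.setAutoCommit(false);  // large object access needs a transaction
        rs = st.executeQuery("SELECT img FROM images_lo WHERE id = 1");
        if (rs.next()) {
            InputStream in = rs.getBinaryStream("img");
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) > 0) {
                // process each chunk as it arrives
            }
            in.close();
        }
        rs.close();
        con.commit();
        con.close();
    }
}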

> on the other hand storing LOBs as Large Objects and using a proprietary API
> does not seem like a good idea to me (besides, the docs mention more
> limitations of the Large Objects method)

Working with LOBs (using an oid column type) does use a proprietary API internally, but that's hidden from view when using the getBytes() and getBinaryStream() methods. I believe there's a flag that controls how the driver behaves with these methods (I think it defaults to looking for bytea columns if it detects a 7.2 backend, otherwise it'll try the LOB API). With LOBs you need to worry about cleaning the unreferenced LOB data out of the database however often it suits, as deleting the tuple won't delete the referenced data (there's a contrib/vacuumlo for this, not sure if it's still maintained).
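
If you do want to touch that API directly (e.g. to stream data in yourself), it looks roughly like the following - the class and method names have moved around between driver versions, so treat it as a sketch rather than gospel, and the table is again a made-up example:

import java.io.FileInputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

import org.postgresql.largeobject.LargeObject;
import org.postgresql.largeobject.LargeObjectManager;

public class LoWrite {
    public static void main(String[] args) throws Exception {
        Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "user", "pw");
        con.setAutoCommit(false);  // LO calls must run inside a transaction

        // the LO manager hangs off the driver's own connection class;
        // exactly how you get at it depends on the driver version
        LargeObjectManager lom =
                ((org.postgresql.PGConnection) con).getLargeObjectAPI();

        // create a new large object and stream a file into it in chunks
        long oid = lom.createLO(LargeObjectManager.READWRITE);
        LargeObject lo = lom.open(oid, LargeObjectManager.WRITE);
        FileInputStream in = new FileInputStream("image.png");
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) > 0) {
            lo.write(buf, 0, n);
        }
        in.close();
        lo.close();

        // the table row only stores the oid that points at the LO data
        PreparedStatement ps = con.prepareStatement(
                "INSERT INTO images_lo (id, img) VALUES (?, ?)");
        ps.setInt(1, 1);
        ps.setLong(2, oid);
        ps.executeUpdate();
        con.commit();
        con.close();
    }
}

Normally setBytes()/setBinaryStream() on the oid column should do all of that under the covers; the explicit API is only worth it if you want finer control (seek, partial reads/writes and so on).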

There are potential security risks with LOBs too (not bytea); they are all stored in one big global table, and I think that any user that can access any database has access to the LOB data. Again, I hope I haven't got that wrong.

> so I wonder - what method do you people use to store LOBs when working with
> JDBC applications? BYTEA is not quite the JDBC LOB type where you expect to
> work with LOB locators and not retrieve the whole data record (similar to
> Oracle's LONG), on the other hand working with Large Objects induces
> certain inefficiency (custom API, certain limitations)
>
> any advice?

I've moved to bytea for the current thing I'm working on, as that seems to be the preferred path, but I haven't done any performance tests yet vs LOBs.
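
For reference, the bytea path is just the standard JDBC calls (table name again only for the example):

import java.io.File;
import java.io.FileInputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class ByteaWrite {
    public static void main(String[] args) throws Exception {
        Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "user", "pw");

        File file = new File("image.png");
        FileInputStream in = new FileInputStream(file);

        // setBinaryStream()/setBytes() map straight onto the bytea column
        PreparedStatement ps = con.prepareStatement(
                "INSERT INTO images_bytea (id, img) VALUES (?, ?)");
        ps.setInt(1, 1);
        ps.setBinaryStream(2, in, (int) file.length());
        ps.executeUpdate();

        in.close();
        ps.close();
        con.close();
    }
}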

Cheers,
Joe

(I wonder if there's some way we can implement 'lazy' loading of the bytea data, i.e. obtain the input stream/entire chunk from the backend only when the column is referenced? Happy to have a crack at this if it sounds feasible.)
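
In the meantime, a crude application-level way to get a similar effect is to pull the bytea out in slices with substring(), so the whole value never has to be fetched at once. A rough sketch, assuming substring() works on bytea on your server version and using the same made-up table as above:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ChunkedByteaRead {
    static final int CHUNK = 64 * 1024;

    public static void main(String[] args) throws Exception {
        Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "user", "pw");
        PreparedStatement ps = con.prepareStatement(
                "SELECT substring(img from ? for ?) FROM images_bytea WHERE id = ?");

        int offset = 1;  // substring() is 1-based
        while (true) {
            ps.setInt(1, offset);
            ps.setInt(2, CHUNK);
            ps.setInt(3, 1);
            ResultSet rs = ps.executeQuery();
            boolean hasRow = rs.next();
            byte[] chunk = hasRow ? rs.getBytes(1) : null;
            rs.close();
            if (chunk == null || chunk.length == 0) break;
            // ... hand the chunk to whatever consumes it ...
            if (chunk.length < CHUNK) break;  // that was the last slice
            offset += chunk.length;
        }
        ps.close();
        con.close();
    }
}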

