Re: Large object insert/update and oid use

From: Louis LeBlanc <db(at)keyslapper(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Louis LeBlanc <db(at)keyslapper(dot)org>, pgsql-general(at)postgresql(dot)org
Subject: Re: Large object insert/update and oid use
Date: 2004-01-31 19:54:35
Message-ID: 20040131195433.GB78400@keyslapper.org
Lists: pgsql-general

On 01/31/04 02:00 PM, Tom Lane sat at the `puter and typed:
> Louis LeBlanc <db(at)keyslapper(dot)org> writes:
> > [ large objects don't work real well in That Other Database ]
>
> Fascinating. I'd have thought they'd use a less flaky design.

No doubt. I haven't had the chance to look at 9i or 10g, but 8.1.7 is
pretty bogus in that respect.

> > I've been looking at the Postgres docs, and I've found the lo_*()
> > routines, which appear to use OIDs to write to the object. Is there
> > any 'shifting' of records during insert because of space availability?
>
> No, OIDs are stable identifiers. It sounds like their rowid equates
> to what we call a TID or CTID --- that is, a tuple's current physical
> location. CTID isn't fixed across updates in Postgres, but OID is.

Nice. Good to know.
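
Just to be sure I've got it, something like this should show the
difference (made-up table; the oid value below is only a placeholder):

    CREATE TABLE t (val text) WITH OIDS;
    INSERT INTO t VALUES ('before');
    SELECT oid, ctid FROM t;     -- e.g. oid = 16411, ctid = (0,1)
    UPDATE t SET val = 'after';
    SELECT oid, ctid FROM t;     -- ctid points at the new row version, oid stays 16411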

> > And, probably more important, is there a way to do a direct update of
> > a large object along with the other fields in a record - without
> > having to call the lo_creat() and lo_write() functions?
>
> Well, I guess the question is how large are your BLOBs and what are you
> doing with them exactly?

Well, we have a couple that are in the 650M range (CD ISOs), but the
average size is around 4K. They are retrieved from the database via a
server process, and inserted via a client process. Can't go into too
much more, but basically, they get stored in the core and kept while
they are useful (rollbacks for the table in question are unnecessary,
and O won't let you turn them off either - can you turn them off in
Postgres?).

> Postgres offers two different ways of dealing with large chunks of data.
> One is just to store them as big text or bytea values (which data type
> you use depends on whether the data is textual or not). This works well
> up to maybe a few megabytes, but it gets unwieldy for larger values,
> mainly because there's no easy way to store sections of a value. In
> some cases you can use substr() to fetch sections of a large value,
> but there's no equivalent means of writing just part of it. Also, if
> the data isn't textual then you have to deal with messy quoting rules
> when constructing SQL commands. (But as of 7.4 it's possible to
> alleviate the quoting problem by using out-of-line parameters instead.)

Sometimes the data is textual, but sometimes it isn't. It can be an
ISO image, a graphic image file, an executable - literally anything.
Any datatype that requires escaping certain characters would be
impractical because of the potential size variation. Fetching part or
all of the object is important so that large objects can be retrieved
in parts, like a continued ftp transaction would do. Also, the
objects are usually written piecemeal when they exceed the 10M mark.
I suppose this could be done with an import though . . .
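
For the bytea route, I picture a resumed fetch looking roughly like
this (the table and column names are made up, just to illustrate):

    -- pull the next 64KB starting at byte offset 1048576 (substr is 1-based)
    SELECT substr(data, 1048577, 65536)
      FROM files
     WHERE name = 'some-image.iso';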

> The other way is to use lo_creat() and related functions. In this case
> each large object has its own identity (an OID assigned by lo_creat) and
> what you put in your table row is just a reference to the object. The
> plus side is that you can read and write a large object in sections
> (using lo_read/lo_write/lo_seek), the downside is that you have a more
> complicated model of what the database structure is, and lots more
> management headaches. For instance you need to explicitly delete a
> large object when you don't need it any more --- deleting a table row
> doesn't in itself make referenced large objects go away.
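
If I'm reading the docs right, that piecewise access can also be driven
straight from SQL with the server-side forms of those functions. A
rough sketch (the OID, the descriptor number, and the INV_WRITE/INV_READ
values 131072/262144 are placeholders and assumptions on my part, and
everything has to happen inside a transaction):

    BEGIN;
    SELECT lo_creat(-1);             -- returns the new object's OID, say 16412
    SELECT lo_open(16412, 131072);   -- 131072 = INV_WRITE; returns a descriptor, say 0
    SELECT lowrite(0, 'first chunk');
    SELECT lowrite(0, 'second chunk');
    SELECT lo_close(0);
    COMMIT;

    BEGIN;
    SELECT lo_open(16412, 262144);   -- 262144 = INV_READ
    SELECT lo_lseek(0, 5, 0);        -- seek to byte offset 5 (0 = SEEK_SET)
    SELECT loread(0, 4096);          -- read up to 4KB from there
    SELECT lo_close(0);
    COMMIT;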

This still sounds like our current model except for the fact that the
OID has to be explicitly deleted. I assume a trigger could be set up
to perform this on the fly though?
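
Something along these lines is what I had in mind - just a sketch,
assuming the table is called files and the large object's OID sits in
a column called data:

    CREATE FUNCTION unlink_lo() RETURNS trigger AS '
    BEGIN
        PERFORM lo_unlink(OLD.data);
        RETURN OLD;
    END;
    ' LANGUAGE plpgsql;

    CREATE TRIGGER files_unlink_lo
        AFTER DELETE ON files
        FOR EACH ROW EXECUTE PROCEDURE unlink_lo();

(Looks like contrib/lo also ships a lo_manage trigger function meant
for this sort of housekeeping, if I'm reading it right.)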

> Personally I'd avoid the large object facility if there were any chance
> of dealing with the data as ordinary wide values instead. It's just
> easier.

Agreed, but probably not feasible when you consider that some of these
objects could eventually reach the gigabyte threshold.

Thanks for the feedback.
Lou
--
Louis LeBlanc leblanc(at)keyslapper(dot)org
Fully Funded Hobbyist, KeySlapper Extraordinaire :)
http://www.keyslapper.org Ô¿Ô¬

Water, taken in moderation cannot hurt anybody.
-- Mark Twain
