Re: Large object insert/update and oid use

From: Louis LeBlanc <db(at)keyslapper(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Large object insert/update and oid use
Date: 2004-01-31 19:41:15
Message-ID: 20040131194109.GA78400@keyslapper.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 01/31/04 06:57 PM, John Sidney-Woollett sat at the `puter and typed:
> Apologies if I missed something, but why are you using the rowid in
> Oracle instead of your own primary key value (even if you are
> inserting 10's of millions of records a day). A number(38) provides
> a range of magnitude 1E-130 .. 10E125

Ah, good question. Mainly because I didn't write the initial
implementation. As I understand it, my predecessor was under the
impression that rowid use was much faster than using the records
existing key, which is a varchar(2048). Now that I've changed the
code to use the key rather than the rowid, it's not any slower
(faster, in fact, though that may be attributable to any number of
other things . . .). On top of that, the key is pretty much
guaranteed to be unique.

> Using a non rowid PK value would mean that you were not at the mercy of
> the database moving your data record elsewhere if it cannot be accomodated
> in the original block following an update.

Exactly. Which is why we don't use rowids at all anymore.

> If you use a number(38) PK value instead of a row ID, you are *sure* that
> your record is accessible regardless of updates.

We thought of that, O is able to generate them through triggers, and
they should be uniqueue enough, but probably not worth the effort
since the tables pk isn't any slower.

> Of course if you wanted to avoid building/maintaining the number(38) PK
> index then that is a different matter...

That isn't so much the problem as trying to stop the swapped LOBs.
Ideally, the LOB would be included in the update when all the data is
collected. Here's how it goes:

Get a key (varchar(2048)) and create a placeholder record to prevent
other threads wasting effort on it (this is autocommitted).
Collect all the data for that key, including the object for the LOB.
Open an atomic transaction
Update the record with an empty LOB and the other data
Select the LOB locator using the pk
Write the object to the LOB locator
Close the atomic transaction, which commits.

The first step of this three step transaction used to return the
rowid, but from time to time it would also result in a shift of the
record to another rowid without notifying the app. The second step
would use that rowid rather than the pk to write the object in.

Ideally, the three step atomic transaction would be reduced to a
single step transaction. This is possible in O, but 8.1.7 has a bug
that causes temp tablespace to be eaten up and not freed.

That's what I'm trying to do in Postgres.

So, can it be done?

Thanks

Lou

> Louis LeBlanc said:
> > Hey folks. I'm new to the list, and not quite what you'd call a DB
> > Guru, so please be patient with me. I'm afraid the lead up here is a
> > bit verbose . . .
> >
> > I am working on an application that uses very high volume DB
> > transactions - in the order of tens of millions per day . . .
> >
> > Anyway, the current database which will remain nameless, but begins
> > with O and rymes with debacle (sorta), has a problem with high volume
> > work when it comes to binary large objects and rowid use (or oid use
> > as I understand Postgres uses).
> >
> > Here's the problem:
> >
> > When a record containing a number of basic types (int, varchar(2048),
> > char(32), etc.) as well as a large object is updated with a LOB, we
> > used to use rowids returned in the update of the other fields. The
> > rowid was then the condition used when selecting the locator for the
> > large object.
> >
> > Unfortunately, when a rowid is returned, it is always the location of
> > the current location, but if the data in question won't fit in the
> > current location, it is shifted to another rowid - but the application
> > can't really get notification of this. Later, when the object is
> > inserted using that rowid, it gets put in the wrong place. The result
> > is that sometimes, a request for one object will yield the wrong
> > object.
> >
> > Of course, there are ways to simply update the whole record, object
> > and all, but there are bugs in the database version we are using that
> > causes the temp tablespace to fill up and not be reused when this
> > method is used, so we've had to change some of this stuff around to
> > use the key field as the condition, and selecting the locator after
> > the initial creation of the empty object. The point is that the
> > direct updates of large objects have problems, as do the use of rowids
> > in high volume conditions. Regardless, we've had to implement some
> > verification checks to ensure the objects don't get swapped - this is
> > done by prefixing the object with a 32 bit MD5 hash that is also
> > stored in a separate field of the same record.
> >
> > I've been looking at the Postgres docs, and I've found the lo_*()
> > routines, which appear to use OIDs to write to the object. Is there
> > any 'shifting' of records during insert because of space availability?
> > And, probably more important, is there a way to do a direct update of
> > a large object along with the other fields in a record - without
> > having to call the lo_creat() and lo_write() functions? I've done
> > some searching in the archives, but can't find anything that appears
> > to answer the question.
> >
> > If I've been unclear on any of this, I'm sorry, feel free to prod for
> > more info.
> >
> > Thanks in advance
> >
> > Lou
> > --
> > Louis LeBlanc leblanc(at)keyslapper(dot)org
> > Fully Funded Hobbyist, KeySlapper Extrordinaire :)
> > http://www.keyslapper.org Ô¿Ô¬
> >
> > Scientists are people who build the Brooklyn Bridge and then buy it.
> > -- William Buckley
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: if posting/reading through Usenet, please send an appropriate
> > subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> > message can get through to the mailing list cleanly
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>
>

--
Louis LeBlanc leblanc(at)keyslapper(dot)org
Fully Funded Hobbyist, KeySlapper Extrordinaire :)
http://www.keyslapper.org Ô¿Ô¬

Federal grants are offered for... research into the recreation
potential of interplanetary space travel for the culturally disadvantaged.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Kris Jurka 2004-01-31 19:44:42 Re: Unicode vs SQL_ASCII DBs
Previous Message Anony Mous 2004-01-31 19:18:40 Re: Two joins on same foreign key