Re: large object seek/write bug

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ian Grant <Ian(dot)Grant(at)cl(dot)cam(dot)ac(dot)uk>
Cc: pgsql-bugs(at)hub(dot)org
Subject: Re: large object seek/write bug
Date: 2000-06-15 06:28:15
Message-ID: 18322.961050495@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Ian Grant <Ian(dot)Grant(at)cl(dot)cam(dot)ac(dot)uk> writes:
> I'm using V 7.0 on a Linux machine and I believe I have found a bug in the
> large object interface provided by libpq.

Thank you for the carefully developed test case. The bug is actually
in the backend, not in libpq: there's an ancient hack in inv_write
that looks at rel->rd_nblocks to decide if the large object is empty.
Unfortunately rd_nblocks isn't maintained very carefully, so the test
might mistakenly consider a recently-created LO to be empty, leading
to the Wrong Thing. But that hack is no longer necessary, since the
index bug it was trying to work around was swatted ages ago; diking
out the check suffices to fix it.

Another error I noticed while testing the fix is that inv_read doesn't
cope gracefully with "holes" in the large object, ie, ranges never
written because of a seek and write past the previous end of file.
With the attached patch, reads of "holes" reliably return zeroes,
as is considered standard behavior for Unix files. (The garbage data
you saw was actually from this error, although the case would not have
been triggered if not for the first bug.)

I believe the attached patch fixes these problems, but I haven't been
able to wring it out very thoroughly because I don't have applications
that do random seeks and writes in large objects. If you could bang
on it a little more and report back, that'd be great.

regards, tom lane

*** src/backend/storage/large_object/inv_api.c.orig Wed Apr 12 13:15:37 2000
--- src/backend/storage/large_object/inv_api.c Thu Jun 15 02:10:27 2000
***************
*** 185,190 ****
--- 185,191 ----
retval->idesc = RelationGetDescr(indr);
retval->offset = retval->lowbyte = retval->highbyte = 0;
ItemPointerSetInvalid(&(retval->htid));
+ retval->flags = 0;

if (flags & INV_WRITE)
{
***************
*** 233,238 ****
--- 234,240 ----
retval->idesc = RelationGetDescr(indrel);
retval->offset = retval->lowbyte = retval->highbyte = 0;
ItemPointerSetInvalid(&(retval->htid));
+ retval->flags = 0;

if (flags & INV_WRITE)
{
***************
*** 371,384 ****
if (whence == SEEK_CUR)
{
offset += obj_desc->offset; /* calculate absolute position */
- return inv_seek(obj_desc, offset, SEEK_SET);
}
!
! /*
! * if you seek past the end (offset > 0) I have no clue what happens
! * :-( B.L. 9/1/93
! */
! if (whence == SEEK_END)
{
/* need read lock for getsize */
if (!(obj_desc->flags & IFS_RDLOCK))
--- 373,380 ----
if (whence == SEEK_CUR)
{
offset += obj_desc->offset; /* calculate absolute position */
}
! else if (whence == SEEK_END)
{
/* need read lock for getsize */
if (!(obj_desc->flags & IFS_RDLOCK))
***************
*** 389,396 ****
offset += _inv_getsize(obj_desc->heap_r,
obj_desc->hdesc,
obj_desc->index_r);
- return inv_seek(obj_desc, offset, SEEK_SET);
}

/*
* Whenever we do a seek, we turn off the EOF flag bit to force
--- 385,392 ----
offset += _inv_getsize(obj_desc->heap_r,
obj_desc->hdesc,
obj_desc->index_r);
}
+ /* now we can assume that the operation is SEEK_SET */

/*
* Whenever we do a seek, we turn off the EOF flag bit to force
***************
*** 414,422 ****
* stores the offset of the last byte that appears on it, and we have
* an index on this.
*/
-
-
- /* right now, just assume that the operation is SEEK_SET */
if (obj_desc->iscan != (IndexScanDesc) NULL)
{
d = Int32GetDatum(offset);
--- 410,415 ----
***************
*** 424,430 ****
}
else
{
-
ScanKeyEntryInitialize(&skey, 0x0, 1, F_INT4GE,
Int32GetDatum(offset));

--- 417,422 ----
***************
*** 487,495 ****

/* copy the data from this block into the buffer */
d = heap_getattr(&tuple, 2, obj_desc->hdesc, &isNull);
ReleaseBuffer(buffer);

! fsblock = (struct varlena *) DatumGetPointer(d);

off = obj_desc->offset - obj_desc->lowbyte;
ncopy = obj_desc->highbyte - obj_desc->offset + 1;
--- 479,505 ----

/* copy the data from this block into the buffer */
d = heap_getattr(&tuple, 2, obj_desc->hdesc, &isNull);
+ fsblock = (struct varlena *) DatumGetPointer(d);
ReleaseBuffer(buffer);

! /*
! * If block starts beyond current seek point, then we are looking
! * at a "hole" (unwritten area) in the object. Return zeroes for
! * the "hole".
! */
! if (obj_desc->offset < obj_desc->lowbyte)
! {
! int nzeroes = obj_desc->lowbyte - obj_desc->offset;
!
! if (nzeroes > (nbytes - nread))
! nzeroes = (nbytes - nread);
! MemSet(buf, 0, nzeroes);
! buf += nzeroes;
! nread += nzeroes;
! obj_desc->offset += nzeroes;
! if (nread >= nbytes)
! break;
! }

off = obj_desc->offset - obj_desc->lowbyte;
ncopy = obj_desc->highbyte - obj_desc->offset + 1;
***************
*** 535,548 ****
Buffer buffer;

/*
! * Fetch the current inversion file system block. If the class
! * storing the inversion file is empty, we don't want to do an
! * index lookup, since index lookups choke on empty files (should
! * be fixed someday).
*/

! if ((obj_desc->flags & IFS_ATEOF)
! || obj_desc->heap_r->rd_nblocks == 0)
tuple.t_data = NULL;
else
inv_fetchtup(obj_desc, &tuple, &buffer);
--- 545,555 ----
Buffer buffer;

/*
! * Fetch the current inversion file system block. We can skip
! * the work if we already know we are at EOF.
*/

! if (obj_desc->flags & IFS_ATEOF)
tuple.t_data = NULL;
else
inv_fetchtup(obj_desc, &tuple, &buffer);

In response to

  • at 2000-06-14 09:57:43 from Ian Grant

Browse pgsql-bugs by date

  From Date Subject
Next Message Ian Grant 2000-06-15 09:45:35 Re: large object seek/write bug
Previous Message Travis Bauer 2000-06-15 03:26:21 Re: jdbc2 bug in absolute (ResultSet.java)