Large objects and locking mechanism

From: Alessandro Baldoni <abaldoni(at)csr(dot)unibo(dot)it>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Large objects and locking mechanism
Date: 1998-05-25 07:51:59
Message-ID: 3569231F.5AB2C94C@csr.unibo.it
Lists: pgsql-hackers

This message reports a possible PostgreSQL bug involving large objects
and the locking mechanism.

1) The problem

I'm currently experiencing the following problem.
I need to store a large amount of double precision numbers (they are
wavelet coefficients, if you know what they are) in a PostgreSQL
database. Since they take up more than 8 kB, I store them as a large
object of about 46 kB.
I've also written a set of functions that operate on them.
One of these functions is the following:


float8 *
get_stddev (Oid wdata, int4 elem)
{
    float8 *result;
    int     fd;

    result = (float8 *) palloc (sizeof (float8));
    if ((fd = lo_open (wdata, INV_READ)) == -1)
        elog (ERROR, "get_stddev: Cannot access wavelet data");

    <some `lo_read's>

    lo_close (fd);
    return result;
}

Once registered in the database, I call it as

SELECT DISTINCT get_stddev (fieldname, 1) FROM tablename;
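
For reference, the function is registered more or less like this (the
path to the shared object below is just a placeholder, not the real
one):

CREATE FUNCTION get_stddev (oid, int4) RETURNS float8
    AS '/usr/local/pgsql/lib/wavelets.so' LANGUAGE 'c';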

Of course, there are also more complicated functions.
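
For completeness, here is a minimal sketch of how the coefficients can
be written into such a large object from a libpq client (the number of
coefficients, the error handling, and the step that stores the returned
oid in the table are assumptions, not my exact code):

#include "libpq-fe.h"
#include "libpq/libpq-fs.h"     /* INV_READ, INV_WRITE */

#define NCOEFF 5890             /* about 46 kB of float8 -- an assumption */

Oid
store_coefficients (PGconn *conn, double *coeff)
{
    Oid lobj;
    int fd;

    /* create a new large object and open it for writing */
    if ((lobj = lo_creat (conn, INV_READ | INV_WRITE)) == 0)
        return 0;
    if ((fd = lo_open (conn, lobj, INV_WRITE)) == -1)
        return 0;

    /* dump the whole coefficient array into the large object */
    if (lo_write (conn, fd, (char *) coeff, NCOEFF * sizeof (double)) < 0)
        lobj = 0;

    lo_close (conn, fd);
    /* the returned oid is then stored in the table row */
    return lobj;
}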

When the number of records in the database reaches about 300 or more, I
get the following messages:

NOTICE: LockReleaseAll: cannot remove lock from HTAB
NOTICE: LockRelease: find xid, table corrupted
NOTICE: LockRelease: find xid, table corrupted
NOTICE: LockRelease: find xid, table corrupted
FATAL: unrecognized data from the backend. It probably dumped core.
FATAL: unrecognized data from the backend. It probably dumped core.

Please note that the first run of the query sometimes gives the
expected results.

If I run

gdb postgres core

and type `where', I get the following backtrace:

#0 0x8100bd9 in hash_search ()
#1 0x8100aec in hash_search ()
#2 0x80d4a75 in LockAcquire ()
#3 0x80d6538 in SingleLockPage ()
#4 0x80d4486 in RelationSetSingleRLockPage ()
#5 0x8070b6a in _bt_pagedel ()
#6 0x80708c0 in _bt_getbuf ()
#7 0x80703d9 in _bt_getroot ()
#8 0x8072105 in _bt_first ()
#9 0x8070fef in btgettuple ()
#10 0x8100414 in fmgr_c ()
#11 0x810071b in fmgr ()
#12 0x806b608 in index_getnext ()
#13 0x80d36c3 in inv_read ()
#14 0x80d354a in inv_read ()
#15 0x8098d09 in lo_read ()
#16 0x40230856 in ?? () from <<<this is my shared library>>>
<other frames follow>

1.1) Further analysis

To study this problem further, I created the following table:

CREATE TABLE foo (fii oid);

and populated it by running

INSERT INTO foo VALUES (lo_import ('/tmp/f'));

300 times, where /tmp/f is a sample file of 46116 bytes.
The problem continues to arise.
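
Something like the following libpq loop reproduces the setup (the
database name is a placeholder; error handling is omitted):

#include <stdio.h>
#include "libpq-fe.h"

int
main (void)
{
    PGconn *conn = PQsetdb (NULL, NULL, NULL, NULL, "test");
    int     i;

    if (PQstatus (conn) == CONNECTION_BAD)
        return 1;

    /* insert the same 46116-byte file 300 times */
    for (i = 0; i < 300; i++)
    {
        PGresult *res =
            PQexec (conn, "INSERT INTO foo VALUES (lo_import ('/tmp/f'))");

        PQclear (res);
    }

    PQfinish (conn);
    return 0;
}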
I also noticed that code that does the following:

for each tuple
    open connection
    lo_export
    close connection
    <something on the exported file>

all goes well.
By contrast, the following

open connection
for each tuple
    lo_export
    <something on the exported file>
close connection

fails around the same tuple.
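
In libpq terms, the failing variant boils down to a sketch like this
(database name, query, and output path are placeholders; moving the
PQsetdb/PQfinish pair inside the loop gives the variant that works):

#include <stdio.h>
#include <stdlib.h>
#include "libpq-fe.h"

int
main (void)
{
    PGconn   *conn = PQsetdb (NULL, NULL, NULL, NULL, "test");
    PGresult *res;
    int       i, n;

    if (PQstatus (conn) == CONNECTION_BAD)
        return 1;

    res = PQexec (conn, "SELECT fii FROM foo");
    n = PQntuples (res);

    for (i = 0; i < n; i++)
    {
        Oid lobj = (Oid) atol (PQgetvalue (res, i, 0));

        /* with a single connection this fails around the same tuple;
           opening and closing the connection inside the loop works */
        if (lo_export (conn, lobj, "/tmp/out") < 0)
            fprintf (stderr, "lo_export failed on tuple %d\n", i);

        /* <something on the exported file> */
    }

    PQclear (res);
    PQfinish (conn);
    return 0;
}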
Once, dmesg showed the message

VFS: file-max limit 1024 reached

but that appeared only a single time.

-->> Everything seems connected with the locking mechanism.
-->> If I run the postmaster with -o -L (turning locking off), almost
-->> everything works.

I usually run the postmaster with the -F flag. I tried disabling it, but
PostgreSQL still fails.
I'm running a Linux box (i586, 120 MHz, 64 MB RAM) with kernel 2.1.65
(ELF) and PostgreSQL 6.3.2 compiled with GCC 2.8.1.

Thanks for any help or suggestions

Alessandro Baldoni
abaldoni(at)csr(dot)unibo(dot)it
http://www.csr.unibo.it/~abaldoni
