Re: [7.0.2] problems with spinlock under FreeBSD?

From: The Hermit Hacker <scrappy(at)hub(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [7.0.2] problems with spinlock under FreeBSD?
Date: 2000-08-24 16:44:38
Message-ID: Pine.BSF.4.21.0008241328390.801-100000@thelab.hub.org
Lists: pgsql-hackers

On Thu, 24 Aug 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy(at)hub(dot)org> writes:
> >> What do you get from gdb backtraces on the corefiles?
>
> > #2 0x80ee847 in s_lock_stuck (lock=0x20048065 "\001", file=0x816723c "spin.c", line=127) at s_lock.c:51
> > #3 0x80ee8c3 in s_lock (lock=0x20048065 "\001", file=0x816723c "spin.c", line=127) at s_lock.c:80
> > #4 0x80f1580 in SpinAcquire (lockid=7) at spin.c:127
> > #5 0x80f3903 in LockRelease (lockmethod=1, locktag=0xbfbfe674, lockmode=1) at lock.c:1044
> > #6 0x80f2af9 in UnlockRelation (relation=0x82063f0, lockmode=1) at lmgr.c:178
> > #7 0x806f25e in index_endscan (scan=0x8208780) at indexam.c:284
>
> That's interesting ... someone failing to release lock.c's master
> spinlock, it looks like. Do you have anything in the postmaster log
> from just before the crashes?

okay, I can't see anything unusual in the log files, but as shown below,
the same thing appears to have happened again at ~10:30am today ...

%ls -lt */*.core
-rw------- 1 pgsql pgsql 22589440 Aug 23 10:39 udmsearch/postgres.core
-rw------- 1 pgsql pgsql 5505024 Aug 23 10:34 rockwell/postgres.core
-rw------- 1 pgsql pgsql 5099520 Aug 23 10:33 pg_banners/postgres.core
-rw------- 1 pgsql pgsql 5009408 Aug 23 10:32 hub_traf_stats/postgres.core
-rw------- 1 pgsql pgsql 5099520 Aug 23 10:32 trends_acctng/postgres.core
-rw------- 1 pgsql pgsql 5124096 Aug 23 10:32 area902/postgres.core
-rw------- 1 pgsql pgsql 5074944 Aug 23 10:32 petpostings/postgres.core
-rw------- 1 pgsql pgsql 5074944 Aug 23 10:32 counter/postgres.core
-rw------- 1 pgsql pgsql 10567680 Aug 23 09:56 horde/postgres.core

Checking a couple of them in gdb:

(gdb) where
#0 0x18271d90 in kill () from /usr/lib/libc.so.4
#1 0x182b2e09 in abort () from /usr/lib/libc.so.4
#2 0x80ee847 in s_lock_stuck (lock=0x20048065 "\001", file=0x816723c "spin.c", line=127) at s_lock.c:51
#3 0x80ee8c3 in s_lock (lock=0x20048065 "\001", file=0x816723c "spin.c", line=127) at s_lock.c:80
#4 0x80f1580 in SpinAcquire (lockid=7) at spin.c:127

(gdb) where
#0 0x18271d90 in kill () from /usr/lib/libc.so.4
#1 0x182b2e09 in abort () from /usr/lib/libc.so.4
#2 0x80ee847 in s_lock_stuck (lock=0x20048065 "\001", file=0x816723c "spin.c", line=127) at s_lock.c:51
#3 0x80ee8c3 in s_lock (lock=0x20048065 "\001", file=0x816723c "spin.c", line=127) at s_lock.c:80
#4 0x80f1580 in SpinAcquire (lockid=7) at spin.c:127

they all appear to be in the same place ...
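
For anyone following the traces: frames #0 through #4 are what you would expect from a spin loop that gives up and calls abort() once a lock has stayed held for too long. Below is a minimal sketch of that pattern, not the actual s_lock.c code. Every name in it (demo_lock_t, demo_try_lock, demo_lock_or_abort) and the spin limit are assumptions made up for illustration, and it uses C11 atomics rather than the platform-specific test-and-set the backend relies on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is NOT PostgreSQL's s_lock.c. */
typedef atomic_flag demo_lock_t;

/*
 * Spin until the lock is free or max_spins attempts have failed.
 * Returns 1 if the lock was acquired, 0 if it still looked held.
 */
int demo_try_lock(demo_lock_t *lock, long max_spins)
{
    for (long i = 0; i < max_spins; i++) {
        /* test-and-set returns the previous value: false means we got it */
        if (!atomic_flag_test_and_set(lock))
            return 1;
    }
    return 0;
}

/* Release the lock so that other spinners can acquire it. */
void demo_unlock(demo_lock_t *lock)
{
    atomic_flag_clear(lock);
}

/*
 * Acquire or die: if the lock never frees up, report the call site
 * and abort(), which is what leaves a core file behind.
 */
void demo_lock_or_abort(demo_lock_t *lock, long max_spins,
                        const char *file, int line)
{
    if (!demo_try_lock(lock, max_spins)) {
        fprintf(stderr, "stuck spinlock (%p) detected at %s:%d\n",
                (void *) lock, file, line);
        abort();
    }
}
```

In a scheme like this, a single process that exits or errors out without calling demo_unlock() would make every later caller of demo_lock_or_abort() time out and dump core at the same call site, which would fit identical backtraces across several databases within a few minutes of each other.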

now, I'm running 4 separate postmaster daemons, with separate data
directories, as:

ps ux | grep postmaster | grep 543
pgsql 50554 0.0 0.1 6904 556 p0- I 1:12PM 0:04.88 /pgsql/bin/postmaster -D/pgsql/special/sales.org -i -p 5434 (postgres)
pgsql 61821 0.0 0.1 7080 636 p6- S 4:38PM 3:03.86 /pgsql/bin/postmaster -B 256 -N 128 -o -F -o /pgsql/logs/5432.61820 -S 32768 -i -p 5432 -D/pgsql/data (postgres)
pgsql 62268 0.0 0.0 5488 0 p4- IW - 0:00.00 /pgsql/bin/postmaster -d 1 -N 16 -o -F -o /pgsql/logs/5433.62267 -S 32768 -i -p 5433 -D/pgsql/special/lee (postgres)
pgsql 27084 0.0 0.1 5496 596 p4- S 8:25AM 0:54.11 /pgsql/bin/postmaster -d 1 -N 16 -o -F -o /pgsql/logs/5437.27083 -S 32768 -i -p 5437 -D/pgsql/special/mukesh (postgres)

and the above core files are from the one running on 5432 ...

you still have your account on that machine if you want to take a quick
look around ... otherwise, is there anything else I should be looking at?
