Bug #643: spin lock aborts in 7.0.3

From: pgsql-bugs(at)postgresql(dot)org
To: pgsql-bugs(at)postgresql(dot)org
Subject: Bug #643: spin lock aborts in 7.0.3
Date: 2002-04-24 03:04:09
Message-ID: 20020424030409.7FEF9475884@postgresql.org
Lists: pgsql-bugs

John Maddalozzo (john(at)journyx(dot)com) reports a bug with a severity of 2
The lower the number the more severe it is.

Short Description
spin lock aborts in 7.0.3

Long Description
Running many backends on a 7.0.3 postgres server
RedHat 6.2
PostgreSQL 7.0.3 on i686-pc-linux-gnu, compiled by gcc egcs-2.91.66

It's been pretty reliable, but we are seeing an increasing frequency of spin lock aborts.

Apr 22 15:49:10 db01 postmaster: FATAL: s_lock(4001506c) at spin.c:111, stuck spinlock. Aborting.
Apr 22 15:49:11 db01 postmaster: FATAL: s_lock(4001506c) at spin.c:111, stuck spinlock. Aborting.

They always come in pairs like that.

Then grief...
Apr 22 15:49:11 db01 postmaster: Server process (pid 21833) exited with status 134 at Mon Apr 22 15:49:09 2002
Apr 22 15:49:11 db01 postmaster: Terminating any active server processes...
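
For reference, here is a minimal sketch of how the "status 134" in the log line above can be read. It assumes the number is either a raw wait() status or a shell-style 128+signal exit code; the log itself does not say which, and the helper name decode_status is made up for illustration. Under both readings the result is signal 6 (SIGABRT), the signal raised by abort().

```python
import signal


def decode_status(status):
    """Interpret a process status number two ways (hypothetical helper).

    Assumption 1: raw wait() status as laid out on Linux, where the low
                  7 bits hold the terminating signal and bit 0x80 flags
                  a core dump.
    Assumption 2: shell-style exit code, i.e. 128 + signal number.
    """
    raw_signal = status & 0x7F
    core_dumped = bool(status & 0x80)
    shell_signal = status - 128 if status > 128 else None
    return raw_signal, core_dumped, shell_signal


raw_signal, core_dumped, shell_signal = decode_status(134)
print("raw wait() reading: signal", raw_signal, "core dumped:", core_dumped)
print("shell-style reading: signal", shell_signal)
print("matches SIGABRT:", raw_signal == signal.SIGABRT)
```

Either way, the status fits a backend that called abort() after giving up on the spinlock, not one that crashed on its own.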

This is painful because the corporate website, as well as some 500+ backends representing 210 DB instances, go bye-bye while postgres reestablishes itself.

I've seen some suggestions that this happens because an application terminates with a SIGSEGV, leaving a lock held. We see frequent messages in the log file like this:
postmaster: pq_recvbuf: unexpected EOF on client connection
These occur much more frequently than the spinlock aborts, so there is not a 1:1 correspondence. In the two cases I just looked at, the pq_recvbuf errors came 2+ and 4+ minutes before the next spinlock aborts. I believe the majority of these are command-line scripts opening connections and not properly closing them. We're addressing that problem separately.
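
The timing comparison described above could be scripted roughly as follows. This is only a sketch: it assumes classic syslog lines with a 15-character timestamp and no year (2002 is taken from the report date), the function names are hypothetical, and the timestamp on the sample EOF line is invented for illustration. The two abort lines are the ones quoted earlier in this report.

```python
from datetime import datetime

ASSUMED_YEAR = 2002  # assumption: syslog lines carry no year


def parse_time(line):
    # The first 15 characters of a classic syslog line: "Apr 22 15:49:10"
    return datetime.strptime(f"{ASSUMED_YEAR} {line[:15]}", "%Y %b %d %H:%M:%S")


def gaps_before_aborts(lines):
    """For each stuck-spinlock abort, return the seconds elapsed since the
    most recent earlier 'unexpected EOF' message (None if there was none)."""
    last_eof = None
    gaps = []
    for line in lines:
        if "pq_recvbuf: unexpected EOF" in line:
            last_eof = parse_time(line)
        elif "stuck spinlock" in line:
            abort_time = parse_time(line)
            if last_eof is None:
                gaps.append(None)
            else:
                gaps.append((abort_time - last_eof).total_seconds())
    return gaps


sample = [
    # Timestamp on this EOF line is invented for the example.
    "Apr 22 15:46:50 db01 postmaster: pq_recvbuf: unexpected EOF on client connection",
    "Apr 22 15:49:10 db01 postmaster: FATAL: s_lock(4001506c) at spin.c:111, stuck spinlock. Aborting.",
    "Apr 22 15:49:11 db01 postmaster: FATAL: s_lock(4001506c) at spin.c:111, stuck spinlock. Aborting.",
]
print(gaps_before_aborts(sample))
```

Running this over the full log would show whether the gap between an EOF message and the next abort is consistent or varies widely.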

The operations that seem to trigger the abort right now are DB schema creations, although there was another one during the nightly vacuum. I can pretty much re-create this on every second or third instance setup. I've done a lot of searching around and haven't found a definitive answer as to what fixes this. I'm willing to upgrade, and willing to generate traces or whatever else, if someone can tell me what might lead to a resolution.

Sample Code

No file was uploaded with this report
