Re: Apparent deadlock 7.0.1

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Simms <grim(at)ewtoo(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Apparent deadlock 7.0.1
Date: 2000-06-08 02:43:50
Message-ID: 14084.960432230@sss.pgh.pa.us
Lists: pgsql-hackers

Michael Simms <grim(at)ewtoo(dot)org> writes:
>>>> I have noticed a deadlock happening on 7.0.1 on updates.
>>>> The backends just lock, and take up as much CPU as they can. I kill
>>>> the postmaster, and the backends stay alive, using CPU at the highest
>>>> rate possible. The operations aren't that expensive, just a single line
>>>> of update.
>>>> Anyone else seen this? Anyone dealing with this?
>>
>> News to me. What sort of hardware are you running on? It sort of
>> sounds like the spinlock code not working as it should --- and since
>> spinlocks are done with platform-dependent assembler, it matters...

> The hardware/software is:

> Linux kernel 2.2.15 (SMP kernel)
> Glibc 2.1.1
> Dual Intel PIII/500

Dual CPUs, huh? I have heard of motherboards that have (misdesigned)
memory caching such that the two CPUs don't reliably see each other's
updates to a shared memory location. Naturally that plays hell with the
spinlock code :-(. It might be necessary to insert some kind of
cache-flushing instruction into the spinlock wait loop to ensure that
the CPUs see each other's changes to the lock.
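
To make the idea concrete, here is a minimal sketch of a test-and-set
spinlock whose wait loop carries explicit memory ordering. It is not
PostgreSQL's s_lock code (which, as noted above, is platform-dependent
assembler); it assumes a C11 compiler with <stdatomic.h>, and the names
demo_spinlock, demo_spin_init, demo_spin_trylock, demo_spin_lock and
demo_spin_unlock are hypothetical. The acquire/release orderings stand
in for the "cache-flushing instruction": they ask the compiler and CPU
to make one processor's update to the lock visible to the other.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical lock type for illustration; not PostgreSQL's slock_t. */
typedef struct {
    atomic_flag held;
} demo_spinlock;

/* Put the lock into the "free" state. */
static void demo_spin_init(demo_spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);
}

/* One test-and-set attempt: returns 1 if we got the lock, 0 if it was held. */
static int demo_spin_trylock(demo_spinlock *l)
{
    assert(l != NULL);
    /* Acquire ordering: later reads/writes cannot move before this point. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is obtained. */
static void demo_spin_lock(demo_spinlock *l)
{
    while (!demo_spin_trylock(l)) {
        /* spin; each iteration re-reads the flag atomically */
    }
}

/* Release ordering: earlier writes become visible before the lock is freed. */
static void demo_spin_unlock(demo_spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Whether this would cure a board with genuinely broken cache coherency is
an open question; the sketch only shows where such an instruction would
sit in the loop.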

This is all theory at this point, and a hole in the theory is that the
backends ought to give up with a "stuck spinlock" error after a minute
or two of not being able to grab the lock. I assume you have let them
go at it for longer than that without seeing such an error?
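
The "give up" behaviour can be sketched as a bounded wait loop. This is
an illustration only, assuming C11 atomics; it bounds the wait by a
number of attempts rather than by wall-clock time, which is a
simplification and not necessarily how the backend measures its
timeout. The names bounded_spinlock and bounded_spin_lock, and the
max_spins parameter, are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical lock type for illustration only. */
typedef struct {
    atomic_flag held;
} bounded_spinlock;

/*
 * Try to take the lock at most max_spins times.
 * Returns 0 once the lock is acquired, or -1 if every attempt found it
 * held, in which case a "stuck spinlock" message is written to stderr.
 */
static int bounded_spin_lock(bounded_spinlock *l, long max_spins)
{
    assert(l != NULL && max_spins > 0);

    for (long i = 0; i < max_spins; i++) {
        if (!atomic_flag_test_and_set_explicit(&l->held,
                                               memory_order_acquire))
            return 0;   /* flag was clear: we now own the lock */
    }

    fprintf(stderr, "stuck spinlock: giving up after %ld attempts\n",
            max_spins);
    return -1;
}
```

A backend spinning forever with no such message suggests it is either
not in a loop of this kind at all, or that the loop's exit condition is
never being reached.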

Anyway, the next step is to "kill -ABORT" some of the stuck processes
and get backtraces from their coredumps to see where they are stuck.
If you find they are inside s_lock() then it's definitely some kind of
spinlock problem. If not...

regards, tom lane
