Re: Spinlocks, yet again: analysis and proposed patches

From: Marko Kreen <marko(at)l-t(dot)ee>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Spinlocks, yet again: analysis and proposed patches
Date: 2005-09-13 07:31:19
Message-ID: 20050913073119.GA29529@l-t.ee
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Sep 11, 2005 at 05:59:49PM -0400, Tom Lane wrote:
> The second reason that the futex patch is helping is that when
> a spinlock delay does occur, it allows the delaying process to be
> awoken almost immediately, rather than delaying 10 msec or more
> as the existing code does. However, given that we are only expecting
> the spinlock to be held for a couple dozen instructions, using the
> kernel futex mechanism is huge overkill --- the in-kernel overhead
> to manage the futex state is almost certainly several orders of
> magnitude more than the delay we actually want.

Why do you think so? AFAIK on uncontented case there will be no
kernel access, only atomic inc/dec. On contented case you'll
want task switch anyway, so the futex managing should not
matter. Also this mechanism is specifically optimized for
inter-process locking, I don't think you can get more efficient
mechanism from side-effects from generic syscalls.

If you don't want Linux-specific locking in core code, then
it's another matter...

> 1. Use sched_yield() if available: it does just what we want,
> ie, yield the processor without forcing any useless time delay
> before we can be rescheduled. This doesn't exist everywhere
> but it exists in recent Linuxen, so I tried it. It made for a
> beautiful improvement in my test case numbers: CPU utilization
> went to 100% and the context swap rate to almost nil. Unfortunately,
> I also saw fairly frequent "stuck spinlock" panics when running
> more queries than there were processors --- this despite increasing
> NUM_DELAYS to 10000 in s_lock.c. So I don't trust sched_yield
> anymore. Whatever it's doing in Linux 2.6 isn't what you'd expect.
> (I speculate that it's set up to only yield the processor to other
> processes already affiliated to that processor. In any case, it
> is definitely capable of getting through 10000 yields without
> running the guy who's holding the spinlock.)

This is intended behaviour of sched_yield.

http://lwn.net/Articles/31462/
http://marc.theaimsgroup.com/?l=linux-kernel&m=112432727428224&w=2

--
marko

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paesold 2005-09-13 10:37:31 Re: Spinlocks, yet again: analysis and proposed patches
Previous Message Qingqing Zhou 2005-09-13 00:16:16 Re: counting disk access from index seek operation -- how to?