Re: sinval synchronization considered harmful

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: sinval synchronization considered harmful
Date: 2011-07-21 22:22:09
Message-ID: 79FCA27A-B912-48B4-90A4-562FEFB1EE75@phlo.org
Lists: pgsql-hackers

On Jul 21, 2011, at 21:15, Robert Haas wrote:
> On Thu, Jul 21, 2011 at 2:50 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> ... On these machines, you need to issue an explicit memory barrier
>>> instruction at each sequence point, or just acquire and release a
>>> spinlock.
>>
>> Right, and the reason that a spinlock fixes it is that we have memory
>> barrier instructions built into the spinlock code sequences on machines
>> where it matters.
>>
>> To get to the point where we could do the sort of optimization Robert
>> is talking about, someone will have to build suitable primitives for
>> all the platforms we support. In the cases where we use gcc ASM in
>> s_lock.h, it shouldn't be too hard to pull out the barrier
>> instruction(s) ... but on platforms where we rely on OS-supplied
>> functions, some research is going to be needed.
>
> Yeah, although falling back to SpinLockAcquire() and SpinLockRelease()
> on a backend-private slock_t should work anywhere that PostgreSQL
> works at all[1]. That will probably be slower than a memory fence
> instruction and certainly slower than a compiler barrier, but the
> point is that - right now - we're doing it the slow way everywhere.

As I discovered while playing with various lockless algorithms to
improve our LWLocks, spin locks aren't actually a replacement for
a (full) barrier.

Lock acquisition only really needs to guarantee that loads and stores
which come after the acquisition operation in program order (i.e., in
the instruction stream) aren't globally visible before that operation
completes. This kind of barrier behaviour is often fittingly called an
"acquire barrier".

Similarly, a lock release operation only needs to guarantee that loads
and stores which occur before that operation in program order are
globally visible before the release operation completes. This, again,
is fittingly called a "release barrier".
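
To make the two definitions concrete, here is a minimal sketch of a
test-and-set lock that asks for exactly these guarantees and nothing
more. It assumes a C11 compiler with <stdatomic.h>, which is not what
s_lock.h uses; demo_lock and the demo_lock_* functions are hypothetical
names for illustration, not PostgreSQL's slock_t API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration; not PostgreSQL's slock_t. */
typedef struct { atomic_flag held; } demo_lock;

#define DEMO_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Acquire barrier only: loads and stores that follow this call in
 * program order cannot become visible before the flag is set. Nothing
 * is promised about accesses that come before the call. */
static void demo_lock_acquire(demo_lock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;   /* spin until the current holder clears the flag */
}

/* Release barrier only: loads and stores that precede this call in
 * program order are visible before the flag is cleared. Nothing is
 * promised about accesses that come after the call. */
static void demo_lock_release(demo_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Single attempt; returns true if the lock was obtained. */
static bool demo_lock_try(demo_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}
```

A lock written this way is sufficient for mutual exclusion, which is
why an implementation is free to provide no more than this.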

Now assume the following code fragment:

global1 = 1;
SpinLockAcquire();
SpinLockRelease();
global2 = 1;

If SpinLockAcquire() has "acquire barrier" semantics, and SpinLockRelease()
has "release barrier" semantics, then it's possible for the store to global1
to be delayed until after SpinLockAcquire(), and similarly for the store
to global2 to be executed before SpinLockRelease() completes. In other
words, what can happen is

SpinLockAcquire();
global1 = 1;
global2 = 1;
SpinLockRelease();

But once that can happen, there's no reason that it couldn't also be

SpinLockAcquire();
global2 = 1;
global1 = 1;
SpinLockRelease();

I didn't check if any of our spin lock implementations is actually affected
by this, but it doesn't seem wise to rely on them being full barriers, even
if it may be true today.
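
For comparison, here is a sketch of what asking for a full barrier
explicitly could look like, instead of relying on an acquire/release
pair. It assumes C11 <stdatomic.h>; the function names are hypothetical,
and global1 and global2 are declared as atomic_int only so that the
example is well-defined C11, not because the original fragment did so.

```c
#include <stdatomic.h>

/* Stand-ins for global1 and global2 from the fragment above. */
static atomic_int global1;
static atomic_int global2;

/* Writer: the full fence keeps the store to global1 ordered before
 * the store to global2, which the acquire/release pair need not do. */
static void publish_with_full_fence(void)
{
    atomic_store_explicit(&global1, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* full barrier */
    atomic_store_explicit(&global2, 1, memory_order_relaxed);
}

/* Reader: loads in the opposite order with its own fence in between.
 * Returns 0 only if it saw global2 == 1 while global1 was still 0,
 * which the two fences together rule out. */
static int reader_sees_consistent_order(void)
{
    int g2 = atomic_load_explicit(&global2, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* full barrier */
    int g1 = atomic_load_explicit(&global1, memory_order_relaxed);
    return !(g2 == 1 && g1 == 0);
}
```

Note that the reader needs a barrier as well; ordering the stores on
the writer's side alone says nothing about the order in which another
CPU performs its loads.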

best regards,
Florian Pflug
