Re: Re: [PATCH] Use CC atomic builtins if available [was: Re: TAS patch for building on armel/armhf thumb]

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Re: [PATCH] Use CC atomic builtins if available [was: Re: TAS patch for building on armel/armhf thumb]
Date: 2012-02-03 01:48:21
Message-ID: 20120203014821.GA11939@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs


Sorry for the late reply, but Heikki, can you get this Itanium
information into s_lock.h as a comment, particularly the information
about the +Ovolatile=__unordered flag?

---------------------------------------------------------------------------

On Mon, Dec 19, 2011 at 11:25:06PM +0200, Heikki Linnakangas wrote:
> On 19.12.2011 22:12, Tom Lane wrote:
> >Noah Misch<noah(at)leadboat(dot)com> writes:
> >>On Mon, Dec 19, 2011 at 05:09:11PM +0200, Heikki Linnakangas wrote:
> >>>That is not sufficient on platforms with a weak memory model, like Itanium.
> >
> >>Other processors may observe the lock as held after its release, but there's no
> >>correctness problem.
> >
> >How weak is the memory model, exactly?
> >
> >A correctness problem would ensue if out-of-order stores are possible,
> >ie other processors could observe the lock being freed (and then acquire
> >it) before seeing changes to shared variables that had been made while
> >holding the lock.
> >
> >I'm a little dubious that this applies to Itanium, because I don't see
> >any memory fence instruction in the TAS macro. If we needed one in
> >S_UNLOCK I'd rather expect there to be one in TAS as well.
>
> There's a pretty good manual called "Implementing Spinlocks on
> Itanium and PA-RISC" from HP at: http://h21007.www2.hp.com/portal/download/files/unprot/itanium/spinlocks.pdf.
> It says, in section "3.2.1 Try to get a spinlock":
>
> > Note also, that the ‘xchg’ instruction always
> > has the ‘acquire’ semantics, so it enforces the correct memory ordering.
>
> But the current implementation seems to be safe, anyway. In section
> "3.2.3 Freeing a spinlock", that manual says:
>
> > Note, that a simple assignment statement into a volatile variable
> will not work, as we are
> > assuming that the +Ovolatile=__unordered compile option is being used.
>
> So +Ovolatile=__unordered would allow the kind of optimization that
> I thought was possible, and we would have a problem if we used it.
> But the default is more conservative, and a simple assignment does
> work.
>
> I compiled the attached test program on an HP Itanium box, using the
> same flags you get from PostgreSQL's configure on that box. The
> relevant assembler output is:
>
> xchg4 r14 = [r15], r14 // M [slocktest.c: 66/3]
> //file/line/col slocktest.c/67/3
> ld1.acq r16 = [r11] // M [slocktest.c: 67/3]
> nop.i 0 // I
> //file/line/col slocktest.c/68/3
> st1.rel [r11] = r10 ;; // M [slocktest.c: 68/3]
> //file/line/col slocktest.c/69/3
> st4.rel [r15] = r0 // M [slocktest.c: 69/3]
> //file/line/col slocktest.c/70/1
>
>
> The trick I missed is that the compiler attaches .rel to all the
> stores and .acq to all the loads through a volatile pointer. gcc
> seems to do the same. So we're safe.
>
>
> It would be interesting to test whether using +Ovolatile=__unordered
> would make a difference to performance. Tacking those memory fence
> modifiers to all the volatile loads and stores might be expensive.
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com

> #include <stdio.h>
>
> #if defined(__GNUC__) || defined(__INTEL_COMPILER)
> #if defined(__ia64__) || defined(__ia64) /* Intel Itanium */
> #define HAS_TEST_AND_SET
>
> typedef unsigned int slock_t;
>
> #define TAS(lock) tas(lock)
>
> /* On IA64, it's a win to use a non-locking test before the xchg proper */
> #define TAS_SPIN(lock) (*(lock) ? 1 : TAS(lock))
>
> #ifndef __INTEL_COMPILER
>
> static __inline__ int
> tas(volatile slock_t *lock)
> {
> long int ret;
>
> __asm__ __volatile__(
> " xchg4 %0=%1,%2 \n"
> : "=r"(ret), "+m"(*lock)
> : "r"(1)
> : "memory");
> return (int) ret;
> }
>
> #else /* __INTEL_COMPILER */
>
> static __inline__ int
> tas(volatile slock_t *lock)
> {
> int ret;
>
> ret = _InterlockedExchange(lock,1); /* this is a xchg asm macro */
>
> return ret;
> }
>
> #endif /* __INTEL_COMPILER */
> #endif /* __ia64__ || __ia64 */
> #endif /* defined(__GNUC__) || defined(__INTEL_COMPILER) */
>
> #if defined(__hpux) && defined(__ia64) && !defined(__GNUC__)
>
> #define HAS_TEST_AND_SET
>
> typedef unsigned int slock_t;
>
> #include <ia64/sys/inline.h>
> #define TAS(lock) _Asm_xchg(_SZ_W, lock, 1, _LDHINT_NONE)
> /* On IA64, it's a win to use a non-locking test before the xchg proper */
> #define TAS_SPIN(lock) (*(lock) ? 1 : TAS(lock))
>
> #endif /* HPUX on IA64, non gcc */
>
> slock_t lock;
> char shared2;
>
> int main(int argc, char **argv)
> {
> volatile char *p = &shared2;
> char local;
>
> TAS(&lock);
> local = *p;
> *p = 123;
> *((volatile slock_t *) &lock) = 0;
> }

>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2012-02-03 06:45:13 Re: BUG #6425: Bus error in slot_deform_tuple
Previous Message Steven Schlansker 2012-02-02 23:30:17 Re: [JDBC] BUG #6293: JDBC driver performance