Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
Date: 2011-09-23 01:11:54
Message-ID: CA+TgmoYFdgeizNusUZDmco7v-3cWKXFZTKC8s1jyVnpK4iYEDg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 22, 2011 at 7:46 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Thu, 2011-09-22 at 19:12 -0400, Robert Haas wrote:
>> But since you asked... as I
>> understand it, unless you're running on Alpha, you actually don't need
>> a barrier here at all, because all currently-used CPUs other than
>> alpha "respect data dependencies", which means that if q->num_items is
>> used to compute an address to be read from memory, the CPU will ensure
>> that the read of that address is performed after the read of the value
>> used to compute the address.  At least that's my understanding.  But
>> Alpha won't.
>
> I'm still trying to figure out how it's even possible to read an address
> that's not computed yet. Something sounds strange about that...

That's because it's strange. You might have a look at
http://www.linuxjournal.com/article/8212

Basically, it seems like on Alpha, the CPU is allowed to do pretty
much anything short of entirely fabricating the value that gets
returned.

> I think it might have more to do with branch prediction or something
> else. In your example, the address is not computed from q->num_items
> directly, it's computed using "i". But that branch being followed is
> dependent on a comparison with q->num_items. Maybe that's the dependency
> that's not respected?

You might be right. I can't swear I understand exactly what goes
wrong there; in fact I'm not 100% sure that you don't need a
read-barrier on things less crazy than Alpha.  I speculate that the
problem is something like this: q->num_items is in one cache line and
all the elements of q->items are in some other cache line, and you see
that you're about to use both of those, so you pull both cache lines
into cache.  But because one cache bank is busier than the other, you
get q->items first.  And between the time you get the cache line
containing q->items and the time you get the cache line containing
q->num_items, someone inserts an item into the queue, and now you're
hosed, because you have the old array contents with the new array
length.
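
To make the scenario above concrete, here is a minimal sketch of the
ordering that would rule it out.  It uses C11 <stdatomic.h> rather than
any PostgreSQL primitive, and the Queue type, QUEUE_CAPACITY, and the
queue_init/queue_push/queue_sum functions are hypothetical names chosen
only to mirror q->items and q->num_items.  It assumes a single writer.

```c
#include <assert.h>
#include <stdatomic.h>

#define QUEUE_CAPACITY 64

/* Hypothetical single-writer queue; field names mirror the discussion. */
typedef struct
{
    int         items[QUEUE_CAPACITY];
    atomic_int  num_items;
} Queue;

static void
queue_init(Queue *q)
{
    atomic_init(&q->num_items, 0);
}

/* Writer: store the element first, then publish the new length. */
static void
queue_push(Queue *q, int value)
{
    int         n = atomic_load_explicit(&q->num_items, memory_order_relaxed);

    assert(n < QUEUE_CAPACITY);
    q->items[n] = value;
    /* Release: the item store above cannot be reordered after this store. */
    atomic_store_explicit(&q->num_items, n + 1, memory_order_release);
}

/* Reader: load the length first, then read only that many elements. */
static long
queue_sum(Queue *q)
{
    /* Acquire: the item loads below cannot be satisfied before this load. */
    int         n = atomic_load_explicit(&q->num_items, memory_order_acquire);
    long        sum = 0;

    for (int i = 0; i < n; i++)
        sum += q->items[i];
    return sum;
}
```

The acquire load plays the role of the read barrier between fetching
q->num_items and fetching q->items; the release store is the matching
write barrier on the inserting side.  With both in place, a reader that
sees the new length is guaranteed to see the array contents that go
with it.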

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
