Re: sinval synchronization considered harmful

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Noah Misch <noah(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: sinval synchronization considered harmful
Date: 2011-07-26 17:43:08
Message-ID: 13077.1311702188@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On further reflection, I don't see that this helps: it just moves the
> problem around. With resetState as a separate variable, nextMsgNum is
> never changed by anyone other than the owner, so we can never have a
> stale load. But if we overload nextMsgNum to also indicate whether
> our state has been reset, then there's a race between when we load
> nextMsgNum and when we load maxMsgNum (instead of code I posted
> previously, which has a race between when we load resetState and when
> we load maxMsgNum). Now, as you say, it seems really, really
> difficult to hit that in practice, but I don't see a way of getting
> rid of the theoretical possibility without either (1) a spinlock or
> (2) a fence. (Of course, on x86, the fence could be optimized down to
> a compiler barrier.) I guess the question is "should we worry about
> that?".

Uh, yes. I've lost count of the number of times I've seen people hit
one-instruction-wide race condition windows, SIGSEGV crash based on
accessing only one byte past the valid data structure, etc etc.
The worst thing about this type of bug is that you can't reproduce the
failure when you try; doesn't mean your users won't see it.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message pasman pasmański 2011-07-26 17:46:03 Re: problem with compiling beta3 on mingw32+WinXP
Previous Message Robert Haas 2011-07-26 17:36:26 Re: sinval synchronization considered harmful