Re: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs

From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs
Date: 2011-08-08 15:28:45
Message-ID: CAEYLb_VUgtq=aSZkTZ5Z45L0-x0VTFOpQpNQPt06Gwr6xmd9aA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 8 August 2011 13:47, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On the flip side, I'm not sure that anyone ever really expected that a
> latch could be safely used this way.  Normally, I'd expect the flag to
> be some piece of state protected by an LWLock, and I think that ought
> to be OK provided that the lwlock is released before setting or
> checking the latch.

I'm inclined to agree. FWIW I'll point out that in addition to your
point, it's also worth remembering that a shared latch isn't always
necessary, as in the case of the archiver, so this shouldn't
fundamentally shake our confidence in the latch implementation.

> Maybe we should try to document the contract here
> a little better; I think it's just that there must be a full memory
> barrier (such as LWLockRelease) in both processes involved in the
> interaction.

Or, maybe we should think about pushing that into the latch
implementation, while providing an interface that is easy to use
correctly and hard to use incorrectly. The latch code already has a
pretty complicated contract, and I had to bend over backwards using it
in the AVLauncher patch that I submitted to the upcoming commitfest.
Weakening or equivocating the contract any further should be
considered a last resort.

> I've been thinking about this too and actually went so far as to do
> some research and put together something that I hope covers most of
> the interesting cases.  The attached patch is pretty much entirely
> untested, but reflects my present belief about how things ought to
> work.

That's interesting. Nice work.

It seems likely that Windows will support ARM in some way in the next
couple of years, so it's good that you're handling this using a win32
abstraction where available. Of course, Itanic is currently supported
on Windows, though not by us, and it is obviously never going to be
worth pursuing such support. The point is that it generally isn't safe
to assume that we'll only ever support Windows on x86 and x86_64.

I'd like to see a #warning where you fall back on acquiring and
releasing a spinlock, or at least a #warning if that in turn falls
back on the semaphore-based spinlock implementation. Granted, that
directive isn't in the C standard, but it's pretty portable in
practice. If it breaks the build on some very esoteric platform, that
may not be a bad thing - in fact, maybe you should use the portable
#error directive to make sure it breaks. I'd rather have someone
complain about that than lots more people complain about Postgres
being inexplicably slow, or worse not complain and dismiss Postgres
out of hand for their use-case.

By the way, I don't think that the comment "Won't work on Visual
Studio 2003" is accurate. Everything you do for the
defined(WIN32_ONLY_COMPILER) case is documented as working with 2003.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-08-08 15:33:29 Re: fstat vs. lseek
Previous Message Jesper Krogh 2011-08-08 15:11:12 Re: mosbench revisited