Re: [9.4 bug] The database server hangs with write-heavy workload on Windows

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: MauMau <maumau307(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [9.4 bug] The database server hangs with write-heavy workload on Windows
Date: 2014-10-13 15:26:34
Message-ID: 20141013152634.GN18020@awork2.anarazel.de
Lists: pgsql-hackers

On 2014-10-13 17:56:10 +0300, Heikki Linnakangas wrote:
> So the gist of the problem is that LWLockRelease doesn't wake up
> LW_WAIT_UNTIL_FREE waiters, when releaseOK == false. It should, because a
> LW_WAIT_UNTIL_FREE waiter is now free to run if the variable has changed in
> value, and it won't steal the lock from the other backend that's waiting to
> get the lock in exclusive mode, anyway.

I'm not a big fan of that change. Right now we don't iterate the waiters
if releaseOK isn't set, which is good for the normal lwlock code because
it avoids pointer indirections (of stuff likely residing on another
CPU). Wouldn't it be more sensible to reset releaseOK in *UpdateVar()? I
might just be missing something here.
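
To make the two options concrete, here is a toy model of the idea, not
the real lwlock.c code. Every Toy* name is hypothetical, the queue is a
plain linked list with no spinlock, and "reset releaseOK" is read as
"set it back to true when the variable is updated". The release path
keeps its fast exit when releaseOK is false, so the normal case never
touches the waiter list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. All Toy* names are hypothetical, not PostgreSQL's. */
typedef enum { TOY_EXCLUSIVE, TOY_WAIT_UNTIL_FREE } ToyWaitMode;

typedef struct ToyWaiter {
    ToyWaitMode mode;
    bool woken;
    struct ToyWaiter *next;
} ToyWaiter;

typedef struct {
    bool releaseOK;      /* may the releaser walk the wait queue? */
    unsigned long var;   /* the variable protected by the lock */
    ToyWaiter *head;     /* singly linked wait queue */
} ToyLock;

/* Release: returns how many waiters were woken by this call. */
static int toy_release(ToyLock *lock)
{
    int woken = 0;

    if (!lock->releaseOK)
        return 0;        /* fast path: the waiter list is never touched */

    for (ToyWaiter *w = lock->head; w != NULL; w = w->next) {
        if (w->woken)
            continue;    /* already signalled, just has not run yet */
        w->woken = true;
        woken++;
        if (w->mode == TOY_EXCLUSIVE) {
            /* Simplification: stop and block further wakeups until reset. */
            lock->releaseOK = false;
            break;
        }
    }
    return woken;
}

/* Update the protected variable; optionally re-arm releaseOK. */
static void toy_update_var(ToyLock *lock, unsigned long val, bool reset_release_ok)
{
    lock->var = val;
    if (reset_release_ok)
        lock->releaseOK = true;
}

/* Returns whether the wait-until-free waiter ends up woken. */
static bool toy_demo(bool reset_release_ok)
{
    ToyWaiter until_free = { TOY_WAIT_UNTIL_FREE, false, NULL };
    ToyWaiter exclusive = { TOY_EXCLUSIVE, false, NULL };
    ToyLock lock = { true, 0, &exclusive };

    int n = toy_release(&lock);      /* wakes the exclusive waiter, clears releaseOK */
    assert(n == 1);
    exclusive.next = &until_free;    /* second waiter queues before the first one runs */
    toy_update_var(&lock, 42, reset_release_ok);
    toy_release(&lock);
    return until_free.woken;
}
```

In this model toy_demo(false) leaves the wait-until-free waiter asleep,
which is the shape of the hang, while toy_demo(true) wakes it without
adding any work to the releaseOK == false path.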

>
> I noticed another potential bug: LWLockAcquireCommon doesn't use a volatile
> pointer when it sets the value of the protected variable:
>
> > /* If there's a variable associated with this lock, initialize it */
> > if (valptr)
> >     *valptr = val;
> >
> > /* We are done updating shared state of the lock itself. */
> > SpinLockRelease(&lock->mutex);
>
> If the compiler or CPU decides to reorder those two, so that the variable is
> set after releasing the spinlock, things will break.

Good catch. As Robert says, that should be fine with master, but 9.4
obviously needs it.
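
For illustration only, the ordering requirement can be sketched in
portable C11. This deliberately swaps in <stdatomic.h> release/acquire
ordering instead of the volatile-pointer convention discussed above, and
the Toy* names are hypothetical, not PostgreSQL's spinlock API. The
point is just that the store to the protected value has to be ordered
before the operation that drops the mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Toy spinlock-protected variable. Names are hypothetical. */
typedef struct {
    atomic_flag mutex;
    uint64_t value;
} ToyVarLock;

static void toy_spin_acquire(ToyVarLock *l)
{
    /* Acquire ordering: later accesses cannot move above taking the lock. */
    while (atomic_flag_test_and_set_explicit(&l->mutex, memory_order_acquire))
        ;  /* busy-wait */
}

static void toy_spin_release(ToyVarLock *l)
{
    /* Release ordering: earlier stores cannot move below dropping the lock. */
    atomic_flag_clear_explicit(&l->mutex, memory_order_release);
}

static void toy_set_value(ToyVarLock *l, uint64_t val)
{
    toy_spin_acquire(l);
    l->value = val;          /* must be visible before the lock is released */
    toy_spin_release(l);
}

static uint64_t toy_get_value(ToyVarLock *l)
{
    toy_spin_acquire(l);
    uint64_t v = l->value;
    toy_spin_release(l);
    return v;
}

/* Single-threaded round trip: write under the lock, read it back. */
static uint64_t toy_roundtrip(uint64_t val)
{
    ToyVarLock l = { ATOMIC_FLAG_INIT, 0 };

    toy_set_value(&l, val);
    uint64_t v = toy_get_value(&l);
    assert(v == val);
    return v;
}
```

A single-threaded round trip like toy_roundtrip() cannot demonstrate the
reordering itself; it only shows where the barrier sits relative to the
store.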

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
