Re: Condition variable live lock

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Condition variable live lock
Date: 2017-12-28 23:16:20
Message-ID: CAEepm=1_S2Ly3Q53yViq29RVJmvaUw8hXs5_ekg_E1uHrNtXGQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 22, 2017 at 4:46 PM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> while (ConditionVariableSignal(cv))
>     ++nwoken;
>
> The problem is that another backend can be woken up, determine that it
> would like to wait for the condition variable again, and then get
> itself added to the back of the wait queue *before the above loop has
> finished*, so this interprocess ping-pong isn't guaranteed to
> terminate. It seems that we'll need something slightly smarter than
> the above to avoid that.

Here is one way to fix it: track the wait queue size and use that
number to limit the wakeup loop. See attached.
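
To make the idea concrete, here is a minimal single-process sketch of a
bounded wakeup loop. It is not the attached patch: SketchCV and the
sketch_cv_* functions are hypothetical stand-ins invented for
illustration, the queue is a plain ring buffer rather than a proclist,
and there is no locking. The signal function models the worst case
described above, where every woken waiter immediately re-queues itself.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_WAITERS 8

/* Hypothetical stand-in for a condition variable; not PostgreSQL's struct. */
typedef struct SketchCV
{
    int queue[MAX_WAITERS];     /* ring buffer of waiter ids */
    int head;                   /* index of the oldest waiter */
    int nwaiters;               /* tracked wait queue size */
} SketchCV;

static void
sketch_cv_init(SketchCV *cv)
{
    cv->head = 0;
    cv->nwaiters = 0;
}

/* Add a waiter to the back of the queue. */
static void
sketch_cv_enqueue(SketchCV *cv, int id)
{
    assert(cv->nwaiters < MAX_WAITERS);
    cv->queue[(cv->head + cv->nwaiters) % MAX_WAITERS] = id;
    cv->nwaiters++;
}

/*
 * Wake the oldest waiter.  To model the pathological pattern, the woken
 * waiter immediately re-adds itself to the back of the queue.  Returns
 * false if nobody was waiting.
 */
static bool
sketch_cv_signal(SketchCV *cv)
{
    int id;

    if (cv->nwaiters == 0)
        return false;
    id = cv->queue[cv->head];
    cv->head = (cv->head + 1) % MAX_WAITERS;
    cv->nwaiters--;
    sketch_cv_enqueue(cv, id);  /* the waiter decides to wait again */
    return true;
}

/*
 * Broadcast with a bounded loop: snapshot the queue size first and never
 * wake more than that many waiters, so re-queued waiters cannot keep the
 * loop spinning.  (A real implementation would have to read the size
 * under the condition variable's lock; that is omitted here.)
 */
static int
sketch_cv_broadcast(SketchCV *cv)
{
    int limit = cv->nwaiters;
    int nwoken = 0;

    while (nwoken < limit && sketch_cv_signal(cv))
        ++nwoken;
    return nwoken;
}
```

With three waiters queued, sketch_cv_broadcast() returns after exactly
three wakeups even though the queue never drains, whereas the unbounded
"while (signal) ++nwoken" form would loop forever in this model.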

That's unbackpatchable though, because it changes the size of struct
ConditionVariable, potentially breaking extensions compiled against an
earlier point release. Maybe this won't really cause trouble in v10
anyway? It requires a particular interaction pattern that
barrier.c produces but more typical client code might not: the awoken
backends keep re-adding themselves because they're waiting for
everyone (including the waker) to do something, but the waker is stuck
in that broadcast loop.

Thoughts?

--
Thomas Munro
http://www.enterprisedb.com

Attachment: fix-cv-livelock.patch (application/octet-stream, 3.9 KB)
