Re: Condition variable live lock

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Condition variable live lock
Date: 2018-01-05 06:33:10
Message-ID: CAEepm=0zg2dSqHdMbf6ATB9Q4MPCBQfuH23z_00bWEV3wbX-rw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 5, 2018 at 7:10 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I thought of another possible issue, though. In the situation where
> someone else has removed our sentinel (presumably, by issuing
> ConditionVariableSignal just before we were about to remove the
> sentinel), my patch assumes we can just do nothing. But it seems
> like that amounts to losing one signal. Whoever the someone else
> was probably expected to awaken a waiter, and now that won't happen.

Yeah, that's bad.

> Should we rejigger the logic so that it awakens one additional waiter
> (if there is one) after detecting that someone else has removed the
> sentinel? Obviously, this trades a risk of loss of wakeup for a risk
> of spurious wakeup, but presumably the latter is something we can
> cope with.
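
To make the trade-off concrete, here is a toy, single-threaded model of
the scenario. It is only a sketch: the names (ToyCV, toy_signal,
toy_finish_broadcast) are made up, the wait list is a plain array, and
"waking" an entry just means popping its id. It is not the code from the
patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model only: a small FIFO of ids stands in for the condition
 * variable's wait list.  All names here are hypothetical.
 */
#define TOY_MAX 16

typedef struct ToyCV
{
    int queue[TOY_MAX];
    int len;
} ToyCV;

/* Append a waiter (or a broadcaster's sentinel) to the wait list. */
static void
toy_enqueue(ToyCV *cv, int id)
{
    assert(cv->len < TOY_MAX);
    cv->queue[cv->len++] = id;
}

/* Remove the entry at position pos, preserving order. */
static void
toy_remove_at(ToyCV *cv, int pos)
{
    for (int i = pos + 1; i < cv->len; i++)
        cv->queue[i - 1] = cv->queue[i];
    cv->len--;
}

/*
 * Pop the front entry and return its id, or -1 if the list is empty.
 * Like a signaller in this scenario, it cannot tell a sentinel from a
 * real waiter.
 */
static int
toy_signal(ToyCV *cv)
{
    int id;

    if (cv->len == 0)
        return -1;
    id = cv->queue[0];
    toy_remove_at(cv, 0);
    return id;
}

/*
 * Broadcaster's last step: take its own sentinel off the list.  If it is
 * already gone, some signaller spent its wakeup on the sentinel; with
 * "compensate" set, pass that wakeup on to the next entry, if any.
 * Returns the id awakened by the compensating signal, or -1 if none.
 */
static int
toy_finish_broadcast(ToyCV *cv, int sentinel, bool compensate)
{
    for (int i = 0; i < cv->len; i++)
    {
        if (cv->queue[i] == sentinel)
        {
            toy_remove_at(cv, i);
            return -1;          /* normal case: nothing was lost */
        }
    }
    return compensate ? toy_signal(cv) : -1;
}
```

With the list holding sentinel 100 followed by a late waiter 2, a
toy_signal() call pops 100 instead of a waiter. After that,
toy_finish_broadcast(&cv, 100, false) leaves waiter 2 asleep (the lost
signal), while passing true wakes it. If the list is empty at that
point, the compensating call wakes no one.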

One detail is that the caller of ConditionVariableSignal() got a true
return value when it took out the sentinel (indicating that someone
received the signal), and now when we call ConditionVariableSignal()
ourselves because !aminlist, there may be no one there to wake. I'm not
sure if that's a problem. For comparison, pthread_cond_signal() doesn't
tell you if you actually signalled anyone. Maybe the only reason we have
that return code is so that ConditionVariableBroadcast() can use it the
way it does in master...
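
For reference, the pthread behaviour mentioned above can be checked with
a few lines of C. pthread_cond_signal() returns 0 on success whether or
not any thread was blocked on the condition variable, so its caller
learns nothing about waiters. The helper name below is made up for
illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Signal a condition variable that nobody is waiting on and return
 * pthread_cond_signal()'s result.  The helper name is hypothetical.
 */
static int
signal_with_no_waiters(void)
{
    pthread_cond_t cond;
    int rc;

    rc = pthread_cond_init(&cond, NULL);
    assert(rc == 0);

    rc = pthread_cond_signal(&cond);    /* no thread is blocked on cond */

    pthread_cond_destroy(&cond);
    return rc;
}
```

The call succeeds (returns 0) even though no thread was there to receive
the signal.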

An alternative would be to mark sentinel entries somehow so that
signallers can detect them and signal again, but that's not
backpatchable.
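
A rough sketch of what that alternative could look like, as a toy model
rather than PostgreSQL code: each wait-list entry carries a flag (the
is_sentinel field here is hypothetical), and a signaller that pops a
flagged entry signals again. All names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MARKED_MAX 16

typedef struct MarkedEntry
{
    int  id;
    bool is_sentinel;   /* hypothetical marker set by a broadcaster */
} MarkedEntry;

typedef struct MarkedCV
{
    MarkedEntry queue[MARKED_MAX];
    int         len;
} MarkedCV;

/* Append a waiter, or a broadcaster's marked sentinel, to the list. */
static void
marked_enqueue(MarkedCV *cv, int id, bool is_sentinel)
{
    assert(cv->len < MARKED_MAX);
    cv->queue[cv->len].id = id;
    cv->queue[cv->len].is_sentinel = is_sentinel;
    cv->len++;
}

/*
 * Pop entries from the front until a real waiter is found.  Returns the
 * waiter's id, or -1 if the list holds no real waiter.
 */
static int
marked_signal(MarkedCV *cv)
{
    while (cv->len > 0)
    {
        MarkedEntry front = cv->queue[0];

        for (int i = 1; i < cv->len; i++)
            cv->queue[i - 1] = cv->queue[i];
        cv->len--;

        if (!front.is_sentinel)
            return front.id;
        /* popped a sentinel, not a waiter: signal again */
    }
    return -1;
}
```

In this model the signaller itself makes up for a wakeup that landed on
a sentinel, so the broadcaster would have nothing to compensate for when
it finds its sentinel already gone.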

--
Thomas Munro
http://www.enterprisedb.com
