Re: Condition variables vs interrupts

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Shawn Debnath <sdn(at)amazon(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: Condition variables vs interrupts
Date: 2020-01-28 02:31:55
Message-ID: CA+hUKGLUTeQN7zeRXq0EMR_TNDKcY3BjAOgZMFcXE-Z8sMu8UA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 24, 2019 at 3:10 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> On Sat, Dec 21, 2019 at 2:10 PM Shawn Debnath <sdn(at)amazon(dot)com> wrote:
> > On Fri, Dec 20, 2019 at 12:05:34PM +1300, Thomas Munro wrote:
> > > I think we should probably just remove the unusual ResetLatch() call,
> > > rather than adding a CFI(). See attached. Thoughts?
> >
> > I agree: removing the ResetLatch() and having the wait event code deal
> > with it is a better way to go. I wonder why the reset was done in the
> > first place?

I have pushed this on master only.

> Thanks for the review. I was preparing to commit this today, but I
> think I've spotted another latch protocol problem in the stable
> branches only and I'd to get some more eyeballs on it. I bet it's
> really hard to hit, but ConditionVariableSleep()'s return path (ie
> when the CV has been signalled) forgets that the the latch is
> multiplexed and latches are merged: just because it was set by
> ConditionVariableSignal() doesn't mean it wasn't *also* set by die()
> or some other interrupt, and yet at the point we return, we've reset
> the latch and not run CFI(), and there's nothing to say the caller
> won't then immediately wait on the latch in some other code path, and
> sleep like a log despite the earlier delivery of (say) SIGTERM. If
> I'm right about that, I think the solution is to move that CFI down in
> the stable branches (which you already did for master in commit
> 1321509f).

I'm not going to back-patch this (ie the back-branches version from my
previous email) without more discussion, but I still think it's subtly
broken.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-01-28 02:44:06 Re: [HACKERS] Block level parallel vacuum
Previous Message Tom Lane 2020-01-28 02:11:57 Re: Allow to_date() and to_timestamp() to accept localized names