Re: Optimize LISTEN/NOTIFY

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimize LISTEN/NOTIFY
Date: 2025-10-29 10:33:07
Message-ID: 7556f0d4-03fd-451a-bd34-5f62b424319a@app.fastmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 29, 2025, at 08:05, Chao Li wrote:
>> On Oct 29, 2025, at 05:45, Joel Jacobson <joel(at)compiler(dot)org> wrote:
>> I found a concurrency bug in v21 that could cause missed wakeup when a
>> backend would UNLISTEN on the last channel, which called
>> asyncQueueUnregister, and if wakeupPending was at that time already set,
>> then it wouldn't get reset, since in ProcessIncomingNotify we return
>> early if (listenChannels == NIL), so we would never clear wakeupPending
>> which happens in asyncQueueReadAllNotifications.
>>
>> Fixed by clearing wakeupPending in asyncQueueUnregister:
>>
>> @@ -1597,6 +1597,7 @@ asyncQueueUnregister(void)
>> /* Mark our entry as invalid */
>> QUEUE_BACKEND_PID(MyProcNumber) = InvalidPid;
>> QUEUE_BACKEND_DBOID(MyProcNumber) = InvalidOid;
>> + QUEUE_BACKEND_WAKEUP_PENDING(MyProcNumber) = false;
>> /* and remove it from the list */
>> if (QUEUE_FIRST_LISTENER == MyProcNumber)
>> QUEUE_FIRST_LISTENER = QUEUE_NEXT_LISTENER(MyProcNumber);
>>
>> /Joel<0001-optimize_listen_notify-v22.patch><0002-optimize_listen_notify-v22.patch>
>
> I think the current implementation still has a race problem.
>
> Let’s say notifier N1 notifies listener’s L1 to read message.
> L1 starts to read: it acquires the look, gets reading range, then
> releases the lock, start performs reading without holding the lock.
> Notifier N2 comes, N2 doesn’t have anything L1 is interested in. N2 now
> holds the look, when it checks "if (QUEUE_POS_EQUAL(pos,
> queueHeadBeforeWrite))”, here comes the race. Because the lock is in
> N2’s hand, L1 cannot get the lock to update its pos, so "if
> (QUEUE_POS_EQUAL(pos, queueHeadBeforeWrite))” will not be satisfied, so
> direct advancement won’t happen.

I'm not sure I agree that qualifies as a race "problem" per se, since I
think that just sounds like a case where we would do an unnecessary
wakeup, right?

Without more sophisticated data structures (e.g. skip ranges) and
increased code complexity, there will always be cases where we will by
do unnecessary wakeups, which IMO need not be a design goal to
completely avoid, until we have benchmark data that indicates otherwise.

I think we should iterate by first trying to reason about correctness of
the code, trying to prove/disprove if a notifications could ever end up
not being delivered. The bug I fixed in v22 is an example of such a
case, that would cause a listening backend to never be awaken, since
notifiers would not signal it due to the pending wake that was not
cleared.

I wonder if there could be more such serious bugs in the current code. I
will focus my efforts now trying to answer that question. Would be
really nice if we could find a way to reason formally about this. I've
been looking into the P programming language, which seems suitable for
modeling and verifying these kind of asynchronous concurrency protocols,
I will give it a try.

/Joel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Arne Roland 2025-10-29 10:43:10 Re: apply_scanjoin_target_to_paths and partitionwise join
Previous Message Zhijie Hou (Fujitsu) 2025-10-29 10:31:13 RE: Logical Replication of sequences