Re: Optimize LISTEN/NOTIFY

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Joel Jacobson <joel(at)compiler(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimize LISTEN/NOTIFY
Date: 2025-10-29 07:05:42
Message-ID: DF4BDA30-CC41-4BAF-9852-E399C3F273EC@gmail.com
Lists: pgsql-hackers

> On Oct 29, 2025, at 05:45, Joel Jacobson <joel(at)compiler(dot)org> wrote:
>
> On Tue, Oct 28, 2025, at 07:46, Chao Li wrote:
>>>> But anyway, we should run some load tests to verify every solution to
>>>> see how much they really improve. Do you already have or plan to work
>>>> on a load test script?
>>>
>>> Yes, I'm currently working on a combined benchmark / correctness test suite.
>>>
>>
>> Cool. Then we can run the benchmark and decide.
>
> I found a concurrency bug in v21 that could cause missed wakeup when a
> backend would UNLISTEN on the last channel, which called
> asyncQueueUnregister, and if wakeupPending was at that time already set,
> then it wouldn't get reset, since in ProcessIncomingNotify we return
> early if (listenChannels == NIL), so we would never clear wakeupPending
> which happens in asyncQueueReadAllNotifications.
>
> Fixed by clearing wakeupPending in asyncQueueUnregister:
>
> @@ -1597,6 +1597,7 @@ asyncQueueUnregister(void)
> /* Mark our entry as invalid */
> QUEUE_BACKEND_PID(MyProcNumber) = InvalidPid;
> QUEUE_BACKEND_DBOID(MyProcNumber) = InvalidOid;
> + QUEUE_BACKEND_WAKEUP_PENDING(MyProcNumber) = false;
> /* and remove it from the list */
> if (QUEUE_FIRST_LISTENER == MyProcNumber)
> QUEUE_FIRST_LISTENER = QUEUE_NEXT_LISTENER(MyProcNumber);
>
> /Joel<0001-optimize_listen_notify-v22.patch><0002-optimize_listen_notify-v22.patch>

I think the current implementation still has a race problem.

Let's say notifier N1 notifies listener L1 to read messages.
L1 starts to read: it acquires the lock, gets the range to read, releases the lock, and then performs the read without holding the lock.
Notifier N2 comes in, and N2 doesn't have anything L1 is interested in. N2 now holds the lock, and when it checks "if (QUEUE_POS_EQUAL(pos, queueHeadBeforeWrite))", here comes the race. L1 is still reading, and because the lock is in N2's hands, L1 cannot get the lock to update its pos. The pos that N2 sees is therefore stale, so "if (QUEUE_POS_EQUAL(pos, queueHeadBeforeWrite))" will not be satisfied, and direct advancement won't happen.
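
To make the interleaving concrete, here is a minimal single-threaded sketch of the scenario above. It is not the patch's code: SimQueue and the sim_* functions are hypothetical names I made up for illustration, queue positions are reduced to plain integers, and the lock is only implied by the order in which the functions are called.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the shared notification queue. */
typedef struct SimQueue
{
	int			head;			/* next write position */
	int			listener_pos;	/* L1's advertised read position (shared) */
} SimQueue;

/*
 * L1, under the lock: snapshot the end of the range to read.  The lock is
 * then released and the actual reading happens without it.
 */
int
sim_listener_begin_read(const SimQueue *q)
{
	return q->head;
}

/* L1, after re-acquiring the lock: publish how far it has read. */
void
sim_listener_finish_read(SimQueue *q, int read_upto)
{
	assert(read_upto >= q->listener_pos);
	q->listener_pos = read_upto;
}

/*
 * N2, holding the lock: append nentries that L1 is not interested in.
 * Returns true if L1's position could be advanced directly, mirroring the
 * QUEUE_POS_EQUAL(pos, queueHeadBeforeWrite) check.
 */
bool
sim_notifier_write(SimQueue *q, int nentries)
{
	int			head_before_write = q->head;

	q->head += nentries;
	if (q->listener_pos == head_before_write)
	{
		q->listener_pos = q->head;	/* direct advancement */
		return true;
	}
	return false;				/* pos is stale, no direct advancement */
}
```

If N2's write lands between sim_listener_begin_read and sim_listener_finish_read, the check sees L1's old position and returns false, so L1 ends up behind the new head. If L1 finishes before N2 writes, the same check succeeds and L1 is advanced directly.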

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
