Re: Optimize LISTEN/NOTIFY

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimize LISTEN/NOTIFY
Date: 2025-10-14 16:40:11
Message-ID: 8c71183a-0d28-4bcf-a806-78446ff95404@app.fastmail.com
Lists: pgsql-hackers

On Sat, Oct 11, 2025, at 09:43, Joel Jacobson wrote:
> On Sat, Oct 11, 2025, at 08:43, Joel Jacobson wrote:
>> In addition to previously suggested optimization, there is another major
...
>> I'm not entirely sure this approach is correct though

Having investigated this, the "direct advancement" approach seems
correct to me.

(As I understand it, PreCommit_Notify must take NotifyQueueLock in
exclusive mode because it modifies QUEUE_HEAD, and other operations
that do not acquire the heavyweight lock take NotifyQueueLock in shared
or exclusive mode to read or modify QUEUE_HEAD.)

Given all the experiments since my earlier message, here is a fresh,
self-contained write-up:

This series has two patches:

* 0001-optimize_listen_notify-v16.patch:
Improve test coverage of async.c. Adds isolation specs covering
previously untested paths: subxact LISTEN reparenting/merge/abort,
simple NOTIFY reparenting, notification_match dedup, and an array-growth
case used by the follow-on patch.

* 0002-optimize_listen_notify-v16.patch:
Optimize LISTEN/NOTIFY by maintaining a shared channel map and using
direct advancement to avoid useless wakeups.

Problem
-------

Today SignalBackends wakes all listeners in the same database, with no
knowledge of which backends listen on which channels. When backends
listen on different channels, each NOTIFY wakes backends that have no
interest in it, and the resulting wakeups and context switches can
become the bottleneck in workloads with many listeners.

Overview of the solution (patch 0002)
-------------------------------------

* Introduce a lazily-created DSA+dshash map, channelHash, from
(dboid, channel) to [ProcNumber]. AtCommit_Notify maintains it for
LISTEN/UNLISTEN, and SignalBackends consults it to signal only
listeners on the channels notified within the transaction.
* Add a per-backend wakeupPending flag to suppress duplicate signals.
* Direct advancement: while queuing, PreCommit_Notify records the queue
head before and after our writes. Writers are globally serialized, so
the interval [oldHead, newHead) contains only our entries.
SignalBackends advances any backend still at oldHead directly to
newHead, avoiding a pointless wakeup.
* Keep the queue healthy by signaling backends that have fallen too far
behind (lag >= QUEUE_CLEANUP_DELAY) so the global tail can advance.
* pg_listening_channels and IsListeningOn now read from channelHash.
* Add LWLock tranche NOTIFY_CHANNEL_HASH and wait event
NotifyChannelHash.
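
To make the shape of the channel map concrete, here is a minimal
standalone sketch of a (dboid, channel) -> [ProcNumber] mapping. It is
not the code from the patch: the patch uses DSA+dshash in shared memory,
while this uses a fixed-size array in process-local memory, and every
name prefixed with sketch_/SKETCH_ is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All limits below are arbitrary choices for this sketch. */
#define SKETCH_MAX_ENTRIES   16
#define SKETCH_MAX_LISTENERS 8
#define SKETCH_NAMELEN       64

typedef struct SketchChannelEntry
{
    uint32_t dboid;                       /* key, part 1: database OID */
    char     channel[SKETCH_NAMELEN];     /* key, part 2: channel name */
    int      procs[SKETCH_MAX_LISTENERS]; /* value: listening proc numbers */
    int      nprocs;
    int      used;
} SketchChannelEntry;

static SketchChannelEntry sketch_map[SKETCH_MAX_ENTRIES];

/* Linear-scan lookup; optionally claims a free slot for a new key. */
static SketchChannelEntry *
sketch_find(uint32_t dboid, const char *channel, int create)
{
    SketchChannelEntry *free_slot = NULL;

    for (int i = 0; i < SKETCH_MAX_ENTRIES; i++)
    {
        SketchChannelEntry *e = &sketch_map[i];

        if (e->used && e->dboid == dboid && strcmp(e->channel, channel) == 0)
            return e;
        if (!e->used && free_slot == NULL)
            free_slot = e;
    }
    if (!create || free_slot == NULL)
        return NULL;
    free_slot->used = 1;
    free_slot->dboid = dboid;
    strncpy(free_slot->channel, channel, SKETCH_NAMELEN - 1);
    free_slot->channel[SKETCH_NAMELEN - 1] = '\0';
    free_slot->nprocs = 0;
    return free_slot;
}

/* LISTEN at commit: add procno to the entry. Returns 0 if out of room. */
static int
sketch_listen(uint32_t dboid, const char *channel, int procno)
{
    SketchChannelEntry *e = sketch_find(dboid, channel, 1);

    if (e == NULL)
        return 0;
    for (int i = 0; i < e->nprocs; i++)
        if (e->procs[i] == procno)
            return 1;           /* already listening */
    if (e->nprocs == SKETCH_MAX_LISTENERS)
        return 0;
    e->procs[e->nprocs++] = procno;
    return 1;
}

/* UNLISTEN at commit: remove procno; drop the entry when it empties. */
static void
sketch_unlisten(uint32_t dboid, const char *channel, int procno)
{
    SketchChannelEntry *e = sketch_find(dboid, channel, 0);

    if (e == NULL)
        return;
    for (int i = 0; i < e->nprocs; i++)
    {
        if (e->procs[i] == procno)
        {
            e->procs[i] = e->procs[--e->nprocs];
            break;
        }
    }
    if (e->nprocs == 0)
        e->used = 0;
}

/* NOTIFY side: fetch the listeners for one (dboid, channel) key. */
static int
sketch_listeners(uint32_t dboid, const char *channel, const int **procs)
{
    SketchChannelEntry *e = sketch_find(dboid, channel, 0);

    *procs = (e != NULL) ? e->procs : NULL;
    return (e != NULL) ? e->nprocs : 0;
}

/* Small self-check: same channel name in two databases stays separate. */
static void
sketch_map_demo(void)
{
    const int *procs;

    assert(sketch_listen(1, "jobs", 7));
    assert(sketch_listen(2, "jobs", 3));
    assert(sketch_listeners(1, "jobs", &procs) == 1 && procs[0] == 7);
    sketch_unlisten(1, "jobs", 7);
    assert(sketch_listeners(1, "jobs", &procs) == 0);
    sketch_unlisten(2, "jobs", 3);
}
```

The point of keying on (dboid, channel) is that a notifier only has to
look up the channels it actually notified, instead of visiting every
listening backend in the database.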

No user-visible semantic changes are intended; this is an internal
performance improvement.
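
The signaling rules in the overview (channel filtering, wakeupPending,
direct advancement, and the lag check) can be sketched as one decision
per listener. This is a simplified model, not the patch itself: queue
positions are plain integers rather than (page, offset) pairs, the
order in which the rules are tested is my assumption here, and all
Sketch*/SKETCH_* names and the threshold value are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar stand-in for a queue position. */
typedef int64_t SketchPos;

/* Stand-in for QUEUE_CLEANUP_DELAY; the value is arbitrary in this sketch. */
#define SKETCH_CLEANUP_DELAY 4

typedef struct SketchListener
{
    SketchPos pos;           /* next queue entry this backend will read */
    bool      listens;       /* listens on a channel notified by this xact? */
    bool      wakeupPending; /* already signaled, signal not yet handled */
} SketchListener;

/*
 * Called after the committing writer appended its entries, which occupy
 * exactly [oldHead, newHead) because writers are serialized.
 * Returns the number of signals that would be sent.
 */
static int
sketch_signal_backends(SketchListener *l, size_t n,
                       SketchPos oldHead, SketchPos newHead)
{
    int signals = 0;

    for (size_t i = 0; i < n; i++)
    {
        bool wake = false;

        if (l[i].listens)
            wake = true;            /* it must read our entries */
        else if (l[i].pos == oldHead)
            l[i].pos = newHead;     /* direct advancement, no wakeup */
        else if (newHead - l[i].pos >= SKETCH_CLEANUP_DELAY)
            wake = true;            /* lagging: wake so the tail can move */

        if (wake && !l[i].wakeupPending)
        {
            l[i].wakeupPending = true;  /* suppress duplicate signals */
            signals++;
        }
    }
    return signals;
}

/* Small self-check covering the four cases. */
static void
sketch_signal_demo(void)
{
    SketchListener l[4] = {
        {10, true,  false},  /* interested: signaled */
        {10, false, false},  /* uninterested, at oldHead: advanced */
        {2,  false, false},  /* uninterested, lag 10: signaled */
        {10, true,  true},   /* interested, signal pending: no duplicate */
    };

    assert(sketch_signal_backends(l, 4, 10, 12) == 2);
    assert(l[1].pos == 12 && !l[1].wakeupPending);
    assert(l[0].pos == 10 && l[0].wakeupPending);
}
```

In this model an uninterested backend that sits between the two
thresholds (not at oldHead, lag below the cleanup delay) is left alone;
it is neither signaled nor advanced.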

Benchmark
---------

Measured with a patched pgbench that adds --listen-notify-benchmark
(attached as .txt to avoid confusing cfbot). Each iteration performs
10,000 round trips, and the number of idle listeners grows by 100
between iterations.

master (HEAD):

% ./pgbench_patched --listen-notify-benchmark --notify-round-trips=10000 --notify-idle-step=100

idle_listeners  round_trips_per_sec  max_latency_usec
             0              32123.7               893
           100               1952.5              1465
           200                991.4              3438
           300                663.5              2454
           400                494.6              2950
           500                398.6              3394
           600                332.8              4272
           700                287.1              4692
           800                252.6              5208
           900                225.4              5614
          1000                202.5              6212

0002-optimize_listen_notify-v16.patch:

% ./pgbench_patched --listen-notify-benchmark --notify-round-trips=10000 --notify-idle-step=100

idle_listeners  round_trips_per_sec  max_latency_usec
             0              31832.6              1067
           100              32341.0              1035
           200              31562.5              1054
           300              30040.1              1057
           400              29287.1              1023
           500              28191.9              1201
           600              28166.5              1019
           700              26994.3              1094
           800              26501.0              1043
           900              25974.2              1005
          1000              25720.6              1008

Benchmarked on a MacBook Pro (Apple M3 Max).
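
For readers who want the ratio between the two tables, the small helper
below computes the throughput speedup per row. The numbers are copied
from the tables above; the BenchRow type and function names are only
for illustration. At 1000 idle listeners the ratio comes out at roughly
127x, and at 0 idle listeners it is slightly below 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct BenchRow
{
    int    idle_listeners;
    double master_rps;   /* round_trips_per_sec on master (HEAD) */
    double patched_rps;  /* round_trips_per_sec with patch 0002 */
} BenchRow;

static const BenchRow bench_rows[] = {
    {0,    32123.7, 31832.6},
    {100,  1952.5,  32341.0},
    {200,  991.4,   31562.5},
    {300,  663.5,   30040.1},
    {400,  494.6,   29287.1},
    {500,  398.6,   28191.9},
    {600,  332.8,   28166.5},
    {700,  287.1,   26994.3},
    {800,  252.6,   26501.0},
    {900,  225.4,   25974.2},
    {1000, 202.5,   25720.6},
};

#define BENCH_NROWS (sizeof(bench_rows) / sizeof(bench_rows[0]))

/* Throughput of the patched build relative to master for one row. */
static double
bench_speedup(const BenchRow *r)
{
    return r->patched_rps / r->master_rps;
}

/* Print one line per row: idle listener count and speedup factor. */
static void
bench_print_speedups(void)
{
    for (size_t i = 0; i < BENCH_NROWS; i++)
    {
        assert(bench_rows[i].master_rps > 0.0);
        printf("%5d  %7.2fx\n", bench_rows[i].idle_listeners,
               bench_speedup(&bench_rows[i]));
    }
}
```

Calling bench_print_speedups() from a main() prints the full column.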

Files
-----

* 0001-optimize_listen_notify-v16.patch - tests only.
* 0002-optimize_listen_notify-v16.patch - implementation.
* pgbench-listen-notify-benchmark-patch.txt - adds --listen-notify-benchmark.

Feedback and review are very welcome.

/Joel

Attachment                                 Content-Type              Size
0001-optimize_listen_notify-v16.patch      application/octet-stream  7.8 KB
0002-optimize_listen_notify-v16.patch      application/octet-stream  35.4 KB
pgbench-listen-notify-benchmark-patch.txt  text/plain                9.3 KB
