Re: Proposal: Out-of-Order NOTIFY via GUC to Improve LISTEN/NOTIFY Throughput

From: Rishu Bagga <rishu(dot)postgres(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Joel Jacobson <joel(at)compiler(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "nik(at)postgres(dot)ai" <nik(at)postgres(dot)ai>
Subject: Re: Proposal: Out-of-Order NOTIFY via GUC to Improve LISTEN/NOTIFY Throughput
Date: 2025-09-04 22:53:27
Message-ID: CAK80=jipUfGC+UQSzeA4oCP9daRtHZGm2SQZWLxC9NWmVTDtRQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 18, 2025 at 10:06 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> After thinking about this for awhile, I have a rough idea of
> something we could do to improve parallelism of NOTIFY.
> As a bonus, this'd allow processes on hot standby servers to
> receive NOTIFYs from processes on the primary, which is a
> feature many have asked for.
>
> The core thought here was to steal some implementation ideas
> from two-phase commit. I initially thought maybe we could
> remove the SLRU queue entirely, and maybe we can still find
> a way to do that, but in this sketch it's still there with
> substantially reduced traffic.
>
> The idea basically is to use the WAL log rather than SLRU
> as transport for notify messages.
>
> 1. In PreCommit_Notify(), gather up all the notifications this
> transaction has emitted, and write them into a WAL log message.
> Remember the LSN of this message. (I think this part should be
> parallelizable, because of work that's previously been done to
> allow parallel writes to WAL.)
>
> 2. When writing the transaction's commit WAL log entry, include
> the LSN of the previous notify-data entry.
>
> 3. Concurrently with writing the commit entry, send a message
> to the notify SLRU queue. This would be a small fixed-size
> message with the transaction's XID, database ID, and the LSN
> of the notify-data WAL entry. (The DBID is there to let
> listeners quickly ignore traffic from senders in other DBs.)
>
> 4. Signal listening backends to check the queue, as we do now.
>
> 5. Listeners read the SLRU queue and then, if in same database,
> pull the notify data out of the WAL log. (I believe we already
> have enough infrastructure to make that cheap, because 2-phase
> commit does it too.)
>
> In the simplest implementation of this idea, step 3 would still
> require a global lock, to ensure that SLRU entries are made in
> commit order. However, that lock only needs to be held for the
> duration of step 3, which is much shorter than what happens now.
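
To make steps 3 and 5 above concrete, here is a minimal C sketch of what
the small fixed-size queue entry and the listener-side database filter
might look like. It is not taken from the attached patch: the Sketch*
type names and sketch_collect_lsns() are hypothetical, and the integer
typedefs only stand in for PostgreSQL's TransactionId, Oid and
XLogRecPtr. Reading the notify payload back out of WAL is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for TransactionId, Oid and XLogRecPtr. */
typedef uint32_t SketchXid;
typedef uint32_t SketchOid;
typedef uint64_t SketchLsn;

/* Step 3: fixed-size entry holding XID, database ID and notify-data LSN. */
typedef struct SketchNotifyEntry
{
    SketchXid xid;        /* committing transaction */
    SketchOid dboid;      /* sender's database, used for cheap filtering */
    SketchLsn notify_lsn; /* where the notify payload lives in WAL */
} SketchNotifyEntry;

/*
 * Step 5, listener side: walk the queue in order and collect the LSNs of
 * entries that belong to the listener's own database. Entries from other
 * databases are skipped without touching WAL. Returns the number of LSNs
 * stored in out (never more than out_cap).
 */
size_t
sketch_collect_lsns(const SketchNotifyEntry *queue, size_t n,
                    SketchOid my_dboid, SketchLsn *out, size_t out_cap)
{
    size_t found = 0;

    assert(queue != NULL || n == 0);
    for (size_t i = 0; i < n && found < out_cap; i++)
    {
        if (queue[i].dboid != my_dboid)
            continue;           /* traffic from another database */
        out[found++] = queue[i].notify_lsn;
    }
    return found;
}
```

Because the entry has a fixed size, the work done while holding the
global lock in step 3 is a single short copy, which is the point of
moving the variable-length payload into WAL.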

Attached is an initial patch that implements this idea.

There is still some work to be done around how to handle truncation /
vacuum for the new approach, and on testing replication of notifications
onto a reader instance.

That being said, I ran some basic benchmarking to stress concurrent
notifications.

Using the SQL script below, I ran:
pgbench -T 100 -c 100 -j 8 -f pgbench_transaction_notify.sql -d postgres

BEGIN;
INSERT INTO test VALUES(1);
NOTIFY benchmark_channel, 'transaction_completed';
COMMIT;

With the patch, three runs showed the following TPS:

tps = 66372.705917
tps = 63445.909465
tps = 64412.544339

Without the patch, we got the following TPS:

tps = 30212.390982
tps = 30908.865812
tps = 29191.388601

So, there is about a 2x increase in TPS at 100 connections (roughly 64.7k
vs. 30.1k on average), which suggests the approach is promising.
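
For anyone who wants to recheck that estimate, the averages can be
recomputed with a small C sketch. The function and array names are
illustrative only; the numbers are the six TPS figures listed above.

```c
#include <stddef.h>

/* TPS figures from the three runs with and without the patch. */
const double tps_patched[3]   = {66372.705917, 63445.909465, 64412.544339};
const double tps_unpatched[3] = {30212.390982, 30908.865812, 29191.388601};

/* Arithmetic mean of n samples. */
double
mean_tps(const double *v, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double) n;
}

/* Ratio of mean patched TPS to mean unpatched TPS. */
double
tps_speedup(void)
{
    return mean_tps(tps_patched, 3) / mean_tps(tps_unpatched, 3);
}
```

With these inputs the ratio comes out to roughly 2.15. Three runs per
configuration is a small sample, so this is an indication rather than a
precise measurement.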

Additionally, this would help solve the issue being discussed in a
separate thread [1], where listeners currently rely on the transaction
log to verify whether a transaction they read has indeed committed, but
the relevant portion of the transaction log may already have been
truncated by vacuum.

Would appreciate any thoughts on the direction of this patch.

Thanks, Rishu

[1] https://www.postgresql.org/message-id/CAK98qZ3wZLE-RZJN_Y%2BTFjiTRPPFPBwNBpBi5K5CU8hUHkzDpw%40mail.gmail.com

Attachment Content-Type Size
notify-through-wal.patch application/octet-stream 33.5 KB
