Re: Proposal: Out-of-Order NOTIFY via GUC to Improve LISTEN/NOTIFY Throughput

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "Rishu Bagga" <rishu(dot)postgres(at)gmail(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "nik(at)postgres(dot)ai" <nik(at)postgres(dot)ai>
Subject: Re: Proposal: Out-of-Order NOTIFY via GUC to Improve LISTEN/NOTIFY Throughput
Date: 2025-07-30 09:03:59
Message-ID: 7c8c61c2-4dbf-4961-9e0c-a5ec77d8f846@app.fastmail.com
Lists: pgsql-hackers

On Tue, Jul 22, 2025, at 14:48, Joel Jacobson wrote:
> Benchmark from original post:
...
> For a normal PostgreSQL with the CPU and storage on the same physical machine,
> I think the results above clearly demonstrate that the global exclusive lock
> is at least not the bottleneck, which I strongly believe instead is the flood of
> unnecessary kill(pid, SIGUSR1) syscalls.

I was wrong here. This is much more complex than I initially thought.

After some additional benchmarking and analysis of perf results,
I realize the bottleneck depends on the workload:
it is either the kill() syscalls *or* the heavyweight lock.

Here is one scenario where the heavyweight lock actually *is* the bottleneck:

1 session does LISTEN
pgbench -f notify.sql -c 1000 -j 8 -T 60 -n

Simply commenting out the heavyweight lock gives a dramatic difference:
tps = 7679 (with heavyweight lock; in commit order)
tps = 95430 (without heavyweight lock; not in commit order)
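
For scale, the ratio between the two figures above can be worked out directly. A minimal Python sketch; the variable names are illustrative, and only the two tps values come from the benchmark:

```python
# tps figures quoted from the benchmark above
tps_with_lock = 7679       # heavyweight lock held; in commit order
tps_without_lock = 95430   # heavyweight lock commented out; not in commit order

# Ratio of throughput without the lock to throughput with it
speedup = tps_without_lock / tps_with_lock
print(f"speedup: {speedup:.1f}x")  # prints "speedup: 12.4x"
```

That is roughly a 12x difference for this particular workload (1 listener, 1000 notifying clients); other workloads may behave differently.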

My conclusion so far is that we would benefit greatly both from
reducing/eliminating kill() syscalls and from finding ways to avoid
the heavyweight lock while preserving commit order.

/Joel
