postmaster uses more CPU in 18 beta1 with io_method=io_uring

From: MARK CALLAGHAN <mdcallag(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: postmaster uses more CPU in 18 beta1 with io_method=io_uring
Date: 2025-06-03 19:24:38
Message-ID: CAFbpF8OA44_UG+RYJcWH9WjF7E3GA6gka3gvH6nsrSnEe9H0NA@mail.gmail.com
Lists: pgsql-hackers

When measuring the time to create a connection, it is ~2.3X longer with
io_method=io_uring than with io_method=sync (6.9ms vs 3ms), and the
postmaster process uses ~3.5X more CPU to create connections.

The reproduction case so far is my usage of the Insert Benchmark on a large
server with 48 cores. I need to fix the benchmark client -- today it
creates ~1000 connections/s to run a monitoring query in between every 100
queries and the extra latency from connection create makes results worse
for one of the benchmark steps. While I can fix the benchmark client to
avoid this, I am curious about the extra latency in connection create.

I used "perf record -e cycles -F 333 -g -p $pidof_postmaster -- sleep 30"
but I have yet to find a big difference from the reports generated with
that for io_method=io_uring vs =sync. It shows that much time is spent in
the kernel dealing with the VM (page tables, etc).

The server runs Ubuntu 22.04.4. I compiled the Postgres 18beta1 release
from source via:
./configure --prefix=$pfx --enable-debug CFLAGS="-O2 -fno-omit-frame-pointer" --with-lz4 --with-liburing

Output from configure includes:
checking whether to build with liburing support... yes
checking for liburing... yes

io_uring support was installed via "sudo apt install liburing-dev" and I
have liburing-dev version 2.1-2build1
libc is Ubuntu GLIBC 2.35-0ubuntu3.10
gcc is 11.4.0

More performance info is here:
https://mdcallag.github.io/reports/25_06_01.pg.all.mem.hetz/all.html#summary

The config files I used only differ WRT io_method
* io_method=sync -
https://github.com/mdcallag/mytools/blob/master/bench/conf/arc/may25.hetzner/pg18b1git_o2nofp/conf.diff.cx10b_c32r128
* io_method=workers -
https://github.com/mdcallag/mytools/blob/master/bench/conf/arc/may25.hetzner/pg18b1git_o2nofp/conf.diff.cx10cw4_c32r128
* io_method=io_uring -
https://github.com/mdcallag/mytools/blob/master/bench/conf/arc/may25.hetzner/pg18b1git_o2nofp/conf.diff.cx10d_c32r128

The symptoms are:
* ~20% reduction in point queries/s with io_method=io_uring vs =sync,
=workers or Postgres 17.4. The issue here is not that SELECT
performance has changed. It is that my benchmark client sometimes creates
connections in between running queries, and the added connection latency
with io_method=io_uring hurts throughput.
* CPU/query and context switches/query are similar. With io_uring the
CPU/query might be ~4% larger.

From sampled thread stacks of the postmaster when I use io_uring the common
stack is:
arch_fork,__GI__Fork,__libc_fork,fork_process,postmaster_child_launch,BackendStartup,ServerLoop,PostmasterMain,main

While the typical stack with io_method=sync is:
epoll_wait,WaitEventSetWaitBlock,WaitEventSetWait,ServerLoop,PostmasterMain,main

I run "ps" during each benchmark step, and an example of what I see during a
point query benchmark step (qp100.L2) with io_method=io_uring is below. The
benchmark step runs for 300 seconds.
---> from the start of the step
mdcallag 3762684 0.9 1.5 103027276 2031612 ? Ss 03:12 0:14 /home/mdcallag/d/pg18beta1_o2nofp/bin/postgres -D /data/m/pg
---> from the end of the step
mdcallag 3762684 15.9 1.5 103027276 2031612 ? Rs 03:12 5:04 /home/mdcallag/d/pg18beta1_o2nofp/bin/postgres -D /data/m/pg

And from top I see:
---> with =io_uring
    PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
3762684 mdcallag 20  0 98.3g 1.9g 1.9g R 99.4  1.5 3:04.87 /home/mdcallag/d/pg18beta1_o2nofp/bin/postgres -D /data/m/pg

---> with =sync
    PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
2913673 mdcallag 20  0 98.3g 1.9g 1.9g S 28.3  1.5 0:54.13 /home/mdcallag/d/pg18beta1_o2nofp/bin/postgres -D /data/m/pg

The postmaster had used 0:14 (14 seconds) of CPU time by the start of the
benchmark step and 5:04 (304 seconds) by the end. For the same step with
io_method=sync it was 0:05 at the start and 1:27 at the end. So the
postmaster used ~290 seconds of CPU with =io_uring vs ~82 with =sync, which
is ~3.5X more CPU on the postmaster per connection attempt.
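
The arithmetic above can be checked with a short sketch. It assumes the "ps"
TIME column is in M:SS form, as in the output shown earlier. The helper names
(cpu_seconds, cpu_used) are made up for illustration and are not part of any
tool. It also divides by the 300-second step length as a sanity check against
the %CPU values that top reports.

```python
STEP_SECONDS = 300  # length of the benchmark step, from the text above


def cpu_seconds(ps_time: str) -> int:
    """Convert a ps TIME value in M:SS form (assumed format) to seconds."""
    minutes, seconds = ps_time.split(":")
    return int(minutes) * 60 + int(seconds)


def cpu_used(start: str, end: str) -> int:
    """CPU seconds consumed between two ps samples of the same process."""
    return cpu_seconds(end) - cpu_seconds(start)


# Values copied from the ps samples quoted in this message.
io_uring_cpu = cpu_used("0:14", "5:04")
sync_cpu = cpu_used("0:05", "1:27")
ratio = io_uring_cpu / sync_cpu

print(f"io_uring: {io_uring_cpu}s, sync: {sync_cpu}s, ratio: {ratio:.2f}X")
print(f"share of one core: io_uring {io_uring_cpu / STEP_SECONDS:.0%}, "
      f"sync {sync_cpu / STEP_SECONDS:.0%}")
```

The per-core shares that fall out of this (close to one full core for
=io_uring, under a third of a core for =sync) are in line with the 99.4 and
28.3 %CPU samples from top.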

From vmstat what I see is that some of the rates (cs = context switches, us
= user CPU) are ~20% smaller with =io_uring, which is reasonable given that
the throughput is also ~20% smaller. But sy (system CPU) is not 20% smaller
because of the overhead from all of those calls to fork (or clone).

Avg rates from vmstat
    cs    us    sy  us+sy
492961  25.0  14.0   39.0  ---> with =sync
401233  20.1  14.0   34.1  ---> with =io_uring
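
As a rough check of the "~20% smaller" reading, here is a small sketch that
computes the relative change of each vmstat average from =sync to =io_uring.
The numbers are copied from the table above. The dictionary layout and the
pct_change helper are only illustrative.

```python
# Average vmstat rates from the table above (cs = context switches/s,
# us = user CPU %, sy = system CPU %).
sync = {"cs": 492961, "us": 25.0, "sy": 14.0}
io_uring = {"cs": 401233, "us": 20.1, "sy": 14.0}


def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


changes = {name: pct_change(sync[name], io_uring[name]) for name in sync}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")

# System CPU as a fraction of total CPU (us + sy) for each setting.
sy_share_sync = sync["sy"] / (sync["us"] + sync["sy"])
sy_share_io_uring = io_uring["sy"] / (io_uring["us"] + io_uring["sy"])
print(f"sy share of us+sy: sync {sy_share_sync:.1%}, "
      f"io_uring {sy_share_io_uring:.1%}")
```

cs and us both drop by a bit under 20% while sy does not move, so system CPU
makes up a larger share of total CPU with =io_uring.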

--
Mark Callaghan
mdcallag(at)gmail(dot)com
