From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql.org
Subject: Should io_method=worker remain the default?
Date: 2025-09-03 06:47:48
Message-ID: d68e2a4f8c356107e5167408ad80eaa2fac0f57d.camel@j-davis.com
Lists: pgsql-hackers
Has there already been a discussion about leaving the default as
io_method=worker? There was an Open Item for this, which was closed as
"Won't Fix", but the links don't explain why as far as I can see.
I tested a concurrent scan-heavy workload (see below) where the data
fits in memory, and "worker" seems to be 30% slower than "sync" with
default settings.
I'm not suggesting that AIO overall is slow -- on the contrary, I'm
excited about AIO. But if it regresses in some cases, we should make a
conscious choice about the default and what kind of tuning advice needs
to be offered.
I briefly tried tuning to see if a different io_workers value would
solve the problem, but no luck.
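For reference, the kind of tuning I mean is just varying the GUC, roughly
along these lines (a sketch; the value is illustrative, and io_method itself
needs a server restart rather than a reload):

    ALTER SYSTEM SET io_workers = 8;   -- illustrative value only
    SELECT pg_reload_conf();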
The good news is that io_uring seemed to solve the problem.
Unfortunately, that's platform-specific, so it can't be the default. I
didn't dig in very much, but it seemed to be at least as good as "sync"
mode for this workload.
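Enabling it is just a config change on a build with liburing support,
roughly (build option names from memory):

    # postgresql.conf -- needs a restart; the build must be configured
    # with liburing support (--with-liburing, or -Dliburing=enabled for meson)
    io_method = io_uring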
Regards,
Jeff Davis
Test summary: 32 connections each perform repeated sequential scans.
Each connection scans a different 1GB partition of the same table. I
used partitioning and a predicate to make it easier to script in
pgbench.
Test details:
Machine:
AMD Ryzen 9 9950X 16-Core Processor
64GB RAM
Local storage, NVMe SSD
Ubuntu 24.04 (Linux 6.11, liburing 2.5)
Note: the storage didn't matter much, because the data fits in
memory. To get consistent results when switching between the data
directories for the 17 and 18 tests, I had to drop the filesystem cache
first to make room, then run a few scans to warm it with the data from
the data directory under test.
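By "drop the filesystem cache" I mean something like the usual Linux knob,
followed by a short warm-up pass against the data directory under test,
e.g.:

    # as root, between runs:
    sync; echo 3 > /proc/sys/vm/drop_caches

    # then warm the cache with a few scans, e.g.:
    ./bin/pgbench --dbname=postgres -M prepared -n -c 32 -T 30 -f count.sql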
For simplicity I disabled parallel query, but that didn't seem to have
a big effect. Everything else was set to the default.
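Concretely, disabling parallel query here means something like:

    max_parallel_workers_per_gather = 0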
Setup (checksums enabled):
=> create table t(sid int8, c0 int8, c1 int8, c2 int8, c3 int8,
     c4 int8, c5 int8, c6 int8, c7 int8) partition by range (sid);

$ (for i in `seq 0 31`; do
     echo "create table t$(printf "%02d" $i) partition of t
           for values from ($i) to ($((i+1)));";
   done) | ./bin/psql postgres

$ (for i in `seq 0 31`; do
     echo "insert into t$(printf "%02d" $i)
           select $i, 0, 1, 2, 3, 4, 5, 6, 7
           from generate_series(0, 10000000);";
   done) | ./bin/psql postgres

=> vacuum analyze; checkpoint;
Script count.sql:
SELECT COUNT(*) FROM t WHERE sid=:client_id;
pgbench:
./bin/pgbench --dbname=postgres -M prepared -n -c 32 -T 60 \
    -f count.sql
Results:
PG17:
    tps = 36.209048
PG18 (io_method=sync):
    tps = 34.014890
PG18 (io_method=worker io_workers=3):
    tps = 23.938509
PG18 (io_method=worker io_workers=16):
    tps = 16.734360
PG18 (io_method=io_uring):
    tps = 35.546825
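If someone wants to double-check a run, the effective settings can be
confirmed with something like:

    => SHOW io_method;
    => SHOW io_workers;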