From: Andres Freund <andres(at)anarazel(dot)de>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Should io_method=worker remain the default?
Date: 2025-09-03 15:55:02
Message-ID: adywrhdn5zcfeldzug2pkzbdxizactr46goh4doiztvexrqomy@lgvl26gui63a
Lists: pgsql-hackers

Hi,

On 2025-09-02 23:47:48 -0700, Jeff Davis wrote:
> Has there already been a discussion about leaving the default as
> io_method=worker? There was an Open Item for this, which was closed as
> "Won't Fix", but the links don't explain why as far as I can see.
> I tested a concurrent scan-heavy workload (see below) where the data
> fits in memory, and "worker" seems to be 30% slower than "sync" with
> default settings.
>
> Test summary: 32 connections each perform repeated sequential scans.
> Each connection scans a different 1GB partition of the same table. I
> used partitioning and a predicate to make it easier to script in
> pgbench.

32 parallel seq scans of large relations, with default shared_buffers, fully
cached in the OS page cache, seems like a pretty absurd workload. That's not
to say we shouldn't spend some effort to avoid regressions for it, but it also
doesn't seem worth focusing all that much on. Or is there a real-world
scenario this is actually emulating?

I think the regression is not due to anything inherent to worker, but due to
pressure on AioWorkerSubmissionQueueLock - at least that's what I'm seeing on
an older two-socket machine. It's possible the bottleneck is different on a
newer machine (my newer workstation is busy with another benchmark right now).

*If* we actually care about this workload, we can make
pgaio_worker_submit_internal() acquire that lock conditionally, and perform
the IOs synchronously when the lock can't be acquired immediately (sketch
below). That seems to help here, sufficiently to make worker match sync -
although plenty of contention remains, from "the worker side", which can't
just acquire the lock conditionally.

But I'm really not sure that doing > 30GB/s of repeated reads from the page
cache is a particularly useful thing to optimize. I see a lot of unrelated
contention, e.g. on the buffer mapping locks - unsurprising, it's a really
extreme workload...

If I instead just increase shared_buffers, I get 2x the throughput...
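
E.g. something like the below - the exact value is a guess, it just needs to
cover the 32 x 1GB working set:

    shared_buffers = 40GB

With that the repeated scans are served entirely from shared buffers, instead
of going back out to the page cache over and over.
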
Greetings,

Andres Freund