Re: Should io_method=worker remain the default?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Should io_method=worker remain the default?
Date: 2025-09-03 15:55:02
Message-ID: adywrhdn5zcfeldzug2pkzbdxizactr46goh4doiztvexrqomy@lgvl26gui63a
Lists: pgsql-hackers

Hi,

On 2025-09-02 23:47:48 -0700, Jeff Davis wrote:
> Has there already been a discussion about leaving the default as
> io_method=worker? There was an Open Item for this, which was closed as
> "Won't Fix", but the links don't explain why as far as I can see.

> I tested a concurrent scan-heavy workload (see below) where the data
> fits in memory, and "worker" seems to be 30% slower than "sync" with
> default settings.
>
> Test summary: 32 connections each perform repeated sequential scans.
> Each connection scans a different 1GB partition of the same table. I
> used partitioning and a predicate to make it easier to script in
> pgbench.

32 parallel seq scans of a large relation, with default shared_buffers, fully
cached in the OS page cache, seems like a pretty absurd workload. That's not
to say we shouldn't spend some effort to avoid regressions for it, but it also
doesn't seem worth focusing all that much on. Or is there a real-world
scenario this is actually emulating?

I think the regression is not due to anything inherent to worker, but due to
pressure on AioWorkerSubmissionQueueLock - at least that's what I'm seeing on
an older two-socket machine. It's possible the bottleneck is different on a
newer machine (my newer workstation is busy with another benchmark right now).

*If* we actually care about this workload, we can make
pgaio_worker_submit_internal() acquire that lock conditionally, and perform
the IOs synchronously when the lock can't be had. That seems to help here,
sufficiently to make worker match sync - although plenty of contention
remains, from "the worker side", which can't just acquire the lock
conditionally.

But I'm really not sure doing > 30GB/s of repeated reads from the page cache
is a particularly useful thing to optimize. I see a lot of unrelated
contention, e.g. on the BufferMappingLock - unsurprising, given how extreme
the workload is...

If I instead just increase shared_buffers, I get 2x the throughput...
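For reference, that bump amounts to something like the following - the
concrete value is just an example sized with headroom over the ~32GB working
set (32 connections x 1GB partitions), and a server restart is required for
it to take effect:

```sql
-- Example only: size shared_buffers to cover the working set instead of
-- relying on the OS page cache (default is 128MB).  Needs a restart.
ALTER SYSTEM SET shared_buffers = '40GB';
```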

Greetings,

Andres Freund
