From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Should io_method=worker remain the default?
Date: 2025-09-05 20:25:49
Message-ID: a81f2f7ef34afc24a89c613671ea017e3651329c.camel@j-davis.com
Lists: pgsql-hackers
On Wed, 2025-09-03 at 11:55 -0400, Andres Freund wrote:
> I think the regression is not due to anything inherent to worker, but
> due to pressure on AioWorkerSubmissionQueueLock - at least that's what
> I'm seeing on an older two socket machine. It's possible the bottleneck
> is different on a newer machine (my newer workstation is busy on
> another benchmark rn).
I believe what's happening is that parallelism of the IO completion
work (e.g. checksum verification) is reduced. In worker mode, the
completion work happens on the io workers (of which there are 3),
while in sync mode it happens in the backends (of which there are 32).

There may be lock contention too, but I don't think that's the primary
issue.
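To put rough numbers on that argument, here is a toy model (plain
Python arithmetic, not PostgreSQL code; the unit completion cost is my
assumption) of throughput when completion work is the bottleneck:

```python
# Toy model: if per-IO completion work (e.g. checksum verification)
# dominates, aggregate throughput is bounded by how many processes
# are running completions.
def completion_bound_tput(completers, cost_per_io=1.0):
    return completers / cost_per_io

worker_mode = completion_bound_tput(3)    # completions on 3 io workers
sync_mode = completion_bound_tput(32)     # completions on 32 backends
print(round(sync_mode / worker_mode, 1))  # prints 10.7
```

On this model sync mode has roughly 10x the completion parallelism,
which is consistent with the io workers being pegged while the backends
wait.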
I attached a test patch for illustration. It simplifies the code inside
the LWLock to enqueue/dequeue only, and simplifies and reduces the
wakeups by doing pseudo-random wakeups only when enqueuing. Reducing
the wakeups should reduce the number of signals generated without
hurting my case, because the workers are never idle. And reducing the
instructions while holding the LWLock should reduce lock contention.
But the patch barely makes a difference: still around 24 tps.
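For readers without the attachment, the two ideas can be sketched in
Python (the real patch is C inside PostgreSQL; the queue shape, the 25%
wakeup probability, and the 10ms timeout below are made-up
illustration, not the patch's actual values): the lock covers only
enqueue/dequeue, enqueue wakes a worker only pseudo-randomly, and a
bounded wait means a skipped wakeup only delays work, never loses it.

```python
import random
import threading
from collections import deque

queue = deque()
lock = threading.Lock()
more_work = threading.Condition(lock)  # shares the same lock
stop = threading.Event()
done = []

def submit(io):
    with lock:                      # critical section: enqueue only
        queue.append(io)
        if random.random() < 0.25:  # pseudo-random wakeup: fewer signals
            more_work.notify()

def worker():
    while True:
        with lock:                  # critical section: dequeue only
            io = queue.popleft() if queue else None
        if io is not None:
            done.append(io)         # "complete" the IO outside the lock
        elif stop.is_set():
            return                  # queue drained and no more work coming
        else:
            with more_work:
                more_work.wait(timeout=0.01)  # bounded sleep: no lost work

workers = [threading.Thread(target=worker) for _ in range(3)]
for t in workers:
    t.start()
for i in range(1000):
    submit(i)
stop.set()
for t in workers:
    t.join()
print(len(done))  # prints 1000
```

The timed wait is what makes the probabilistic notify safe; the real
patch can afford fewer wakeups because, as noted, the workers are never
idle in this workload anyway.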
What *does* make a difference is changing io_worker_queue_size. A lower
value of 16 effectively starves the workers of work to do, and I get a
speedup to about 28 tps. A higher value of 512 gives the workers more
chance to issue the IOs -- and more responsibility to complete them --
and it drops to 17 tps. Furthermore, while the test is running, the io
workers are constantly at 100% CPU (mostly verifying checksums) and the
backends are at 50% (20% when io_worker_queue_size=512).
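For anyone reproducing the experiment, these are the knobs involved, as
a postgresql.conf fragment (io_workers = 3 matches the worker count
above; treating 64 as the io_worker_queue_size baseline is my
assumption):

```
io_method = worker
io_workers = 3                # completion work runs on these processes
io_worker_queue_size = 64     # 16 -> ~28 tps, 512 -> ~17 tps in the test above
```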
As an aside, I'm building with meson using
-Dc_args="-msse4.2 -Wtype-limits -Werror=missing-braces". But I notice
that the meson build doesn't seem to use -funroll-loops or
-ftree-vectorize when building checksums.c. Is that intentional? If
not, perhaps slower checksum calculations explain my results.
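One quick way to test that hypothesis, without touching the per-file
build rules, is to re-run the benchmark with the flags forced on
globally (a diagnostic workaround, not a proposed fix):

```
meson setup build --wipe -Dc_args="-msse4.2 -Wtype-limits \
    -Werror=missing-braces -funroll-loops -ftree-vectorize"
ninja -C build
```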
Regards,
Jeff Davis
Attachment: test-aio.patch (text/x-patch, 5.4 KB)