Re: Don't synchronously wait for already-in-progress IO in read stream

From: Andres Freund <andres(at)anarazel(dot)de>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: Don't synchronously wait for already-in-progress IO in read stream
Date: 2026-03-19 22:21:59
Message-ID: 42rdu4q44kvsq53fz5qgzuawqpaytvnemsnquynlfch5mqhc2m@6ytnlgivtzro
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2026-03-18 16:16:14 -0400, Andres Freund wrote:
> > I don't think we'll be able to easily test READ_BUFFER_ALREADY_DONE
> > (though perhaps we aren't testing it for shared buffers either?).
>
> We do reach the READ_BUFFER_ALREADY_DONE in PrepareHeadBufferReadIO(), but
> only due to io_method=sync peculiarities (as that only actually performs the
> IO later when waiting, it's easy to have two IOs for the same block).
>
>
> It's probably worth adding tests for that, although I suspect it should be in
> 001_aio.pl - no read stream required to hit it. I can give it a shot, if you
> want?

I started writing some tests and realized that these commits had made that
somewhat harder - by not using StartBufferIO() anymore, the existing test
infrastructure for StartBufferIO() (in test_aio) became less useful. In
effect it actually reduced coverage some.

Thinking about it more, I also got worried about the duplicating of
logic. It's perhaps acceptable with the patches as-is, but we'll soon need
something very similar for AIO writes. Then we'd end up with like 5 variants,
because we'd still need the existing StartBufferIO() for some cases where we
do want to wait (e.g. the edge case in ExtendBufferedRelShared()).

In the attached prototype I replaced your patch introducing
PrepareHeadBufferReadIO()/ PrepareAdditionalBufferReadIO() with one that
instead revises StartBufferIO() to have the enum return value you introduced
and a PgAioWaitRef * argument that callers that would like to asynchronously
wait for the IO to complete (others pass NULL). There are some other cleanups
in it too, see the commit message for more details.

Your final patch doesn't change a whole lot due to this, instead of calling
PrepareHeadBufferReadIO() it now calls StartBufferIO(wait=true,
&operation->io_wref) and instead of PrepareAdditionalBufferReadIO() it does
StartBufferIO(wait=false, NULL). That means it has to set
operation->foreign_io to true itself, but that seems not problematic.

As part of this I replaced the following comment:
+ /*
+ * Submit any staged IO before checking for in-progress IO. Without this,
+ * the wref check below could find IO that this backend staged but hasn't
+ * submitted yet. Waiting on that would PANIC because the owner can't wait
+ * on its own staged IO.
+ */
+ pgaio_submit_staged();

as I am pretty sure that can't be reached. I added an assert + explanation.

I also updated "Restructure AsyncReadBuffers()" to move
pgstat_prepare_report_checksum_failure() and the computation of flags to
before the ReadBuffersCanStartIO(). And added a comment explaining why little
should be added between the ReadBuffersCanStartIO() calls.

Thoughts?

I still want to expand the tests a bit, but I thought that we should resolve
this structural issue first.

Greetings,

Andres Freund

Attachment Content-Type Size
v7a-0001-aio-Refactor-tests-in-preparation-for-more-tests.patch text/x-diff 10.8 KB
v7a-0002-test_aio-Add-read_stream-test-infrastructure-tes.patch text/x-diff 24.1 KB
v7a-0003-Fix-off-by-one-error-in-read-IO-tracing.patch text/x-diff 2.4 KB
v7a-0004-Pass-io_object-and-io_context-through-to-PinBuff.patch text/x-diff 3.4 KB
v7a-0005-Make-buffer-hit-helper.patch text/x-diff 5.0 KB
v7a-0006-Restructure-AsyncReadBuffers.patch text/x-diff 7.6 KB
v7a-0007-bufmgr-Improve-StartBufferIO-interface.patch text/x-diff 27.5 KB
v7a-0008-AIO-Don-t-wait-for-already-in-progress-IO.patch text/x-diff 7.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2026-03-19 22:43:48 Re: pg_plan_advice
Previous Message Robert Haas 2026-03-19 22:19:17 Re: pg_plan_advice