Re: Don't synchronously wait for already-in-progress IO in read stream

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: Don't synchronously wait for already-in-progress IO in read stream
Date: 2026-03-03 19:47:13
Message-ID: CAAKRu_ZM1epxTdt2=4-g4ff6UC09zne+0xg_gNv3d7LEcxEvNA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the review!

On Thu, Feb 5, 2026 at 11:56 AM Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> wrote:
>
> On Sat, 24 Jan 2026 at 00:04, Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
>
> > - In the 004_read_stream tests, I wonder if there is a way to test
> > that we don't wait for foreign IO until WaitReadBuffers(). We have
> > tests for the stream accessing the same block, which in some cases
> > will exercise the foreign IO path. But it doesn't distinguish between
> > the old behavior -- waiting for the IO to complete when starting read
> > IO on it -- and the new behavior -- not waiting until
> > WaitReadBuffers(). That may not be possible to test, though.
>
> Won't 'stream accessing the same block test' almost always test the
> new behavior (not waiting until WaitReadBuffers())? Having dedicated
> tests for both cases would be helpful, though.

Yea, I was thinking something like testing that if session A is
blocked completing read of block 2 and session B is requesting blocks
2-4 that buffers containing blocks 3 and 4 are valid when session B is
waiting on block 2 to finish.

I started working on something but it needed some new infrastructure
to check if the buffer is valid, and I wanted to see what others
thought first.

I did finally review Andres' test patches and have included my review
feedback here as well.

"aio: Refactor tests in preparation for more tests" (v4-0001) looks
good to me as well. I included one tiny idea AI suggested to me in a
follow-on patch (v4-0002).

> diff --git a/src/test/modules/test_aio/t/004_read_stream.pl
> b/src/test/modules/test_aio/t/004_read_stream.pl
> +foreach my $method (TestAio::supported_io_methods())
> +{
> + $node->adjust_conf('postgresql.conf', 'io_method', 'worker');
> + $node->start();
> + test_io_method($method, $node);
> + $node->stop();
> +}
>
> This seems wrong, we always test io_method=worker. I think we need to
> replace 'worker' with the $method. Also, we can add check below to the
> test_io_method function in the 004_read_stream.pl:
> ```
> is($node->safe_psql('postgres', 'SHOW io_method'),
> $io_method, "$io_method: io_method set correctly");

Good catch. Fixed. I also found a few other small things in this patch
(v4-0003) which I put in v4-0004.

Some ideas I had that I didn't include in v4-0003 because its Andres
patch and is subjective:

For test_repeated_blocks, the first test:

# test miss of the same block twice in a row
$psql->query_safe(
qq/
SELECT evict_rel('largeish');
/);
$psql->query_safe(
qq/
SELECT * FROM read_stream_for_blocks('largeish', ARRAY[0, 2, 2, 4, 4]);
/);
ok(1, "$io_method: stream missing the same block repeatedly");

It says that it will miss the same block repeatedly, is that because
we won't start a read for any of the blocks until after
read_stream_get_block has returned all of them? If so, could be
clearer in the comment. Not everyone understands all the read stream
internals.

I know a lot of other tests do this, but I find it so hard to read the
test with the SQL is totally left-aligned like that -- especially with
comments interspersed. You can easily flow the queries on multiple
lines and indent it more.

For test_repeated_blocks, the second test:

# test hit of the same block twice in a row
$psql->query_safe(
qq/
SELECT evict_rel('largeish');
/);
$psql->query_safe(
qq/
SELECT * FROM read_stream_for_blocks('largeish', ARRAY[0, 1, 2, 3, 4,
5, 6, 5, 4, 3, 2, 1, 0]);
/);
ok(1, "$io_method: stream accessing same block");

I assume that the second access of 2 is a hit because we actually did
IO for the first one (unlike in the earlier case)?

For test_inject_foreign:

In general, I am not ramped up enough on injection point stuff to know
if the actual new injection point stuff you added in test_aio.c is is
correct and optimal, but I did review the read stream additions to
test_aio.c and the tests added to 004_read_stream.pl.

For test_inject_foreign, the 3rd test:

# Test read stream encountering two buffers that are undergoing the same
# IO, started by another backend

I see that psql_b is requesting 3 blocks which can be combined into 1
IO, which makes it different than the 1st foreign IO test case:

###
# Test read stream encountering buffers undergoing IO in another backend,
# with the other backend's reads succeeding.
###

where psql_b only requests 1 but I don't really see how these are
covering different code. Maybe if the read stream one (psql_a) is
doing a combined IO it might exercise slightly different code, but
otherwise I don't get it.

> Nitpick but I prefer something like TrackBufferHit() or
> CountBufferHit() as a function name instead of ProcessBufferHit().
> ProcessBufferHit() gives the impression that the function is doing a
> job more than it currently does. Other than that, 0004 LGTM.

I changed this in attached v4.

- Melanie

Attachment Content-Type Size
v4-0001-aio-Refactor-tests-in-preparation-for-more-tests.patch text/x-patch 10.8 KB
v4-0002-small-optimization-for-test-refactor.patch text/x-patch 1.1 KB
v4-0003-test_aio-Add-read_stream-test-infrastructure-test.patch text/x-patch 22.9 KB
v4-0004-test-fixes.patch text/x-patch 2.5 KB
v4-0005-Make-buffer-hit-helper.patch text/x-patch 5.9 KB
v4-0006-Don-t-wait-for-already-in-progress-IO.patch text/x-patch 20.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2026-03-03 19:47:19 Re: postgres_fdw: Use COPY to speed up batch inserts
Previous Message David G. Johnston 2026-03-03 19:28:16 Re: pg_plan_advice (now with transparent SQL plan performance overrides - pg_stash_advice)