Quick Links

Re: Don't synchronously wait for already-in-progress IO in read stream

From:	Melanie Plageman <melanieplageman(at)gmail(dot)com>
To:	Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject:	Re: Don't synchronously wait for already-in-progress IO in read stream
Date:	2026-03-16 21:45:07
Message-ID:	CAAKRu_YV0D_9wWgGkHdWgQsooqmcc08Uqb3tqxzrwcDXHajDtA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Thanks for the continued review!

Attached v5 adds some comments to the tests, fixes a few nits in the
actual code, and adds a commit to fix what I think is an existing
off-by-one error in TRACE_POSTGRESQL_BUFFER_READ_DONE.

On Fri, Mar 6, 2026 at 8:18 AM Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> wrote:
>
> > For test_repeated_blocks, the first test:
> >
> > # test miss of the same block twice in a row
> > $psql->query_safe(
> > qq/
> > SELECT evict_rel('largeish');
> > /);
> > $psql->query_safe(
> > qq/
> > SELECT * FROM read_stream_for_blocks('largeish', ARRAY[0, 2, 2, 4, 4]);
> > /);
> > ok(1, "$io_method: stream missing the same block repeatedly");
> >
> > It says that it will miss the same block repeatedly, is that because
> > we won't start a read for any of the blocks until after
> > read_stream_get_block has returned all of them? If so, could be
> > clearer in the comment. Not everyone understands all the read stream
> > internals.
>
> I think we start a read of blocks because we hit stream->distance but
> it doesn't affect any consecutive same block numbers. What I
> understood is:
>
> Since io_combine_limit is 1, there won't be any IO combining.
>
> 0th block (0), miss, distance is 1; StartReadBuffersImpl() and
> WaitReadBuffers() are called for 0th block.
> 1th block (2), miss, distance is 2, StartReadBuffersImpl() is called.
> 2th block (2), miss, distance is 2, StartReadBuffersImpl() and
> WaitReadBuffers() are called 1th block.
> 3th block (4), miss, distance is 4, StartReadBuffersImpl() is called.
> 4th block (4), miss, distance is 4, StartReadBuffersImpl() and
> WaitReadBuffers() are called 2, 3 and 4th blocks.

Makes sense. I've tried to add a comment to this effect.

> > I know a lot of other tests do this, but I find it so hard to read the
> > test with the SQL is totally left-aligned like that -- especially with
> > comments interspersed. You can easily flow the queries on multiple
> > lines and indent it more.
>
> I agree with you.

I did reflow the SQL. It does mean there will be a bunch of extra
whitespace sent to the server. Other tests do this, though. I wonder
how much it affects performance...

> > For test_repeated_blocks, the second test:
> >
> > # test hit of the same block twice in a row
> > $psql->query_safe(
> > qq/
> > SELECT evict_rel('largeish');
> > /);
> > $psql->query_safe(
> > qq/
> > SELECT * FROM read_stream_for_blocks('largeish', ARRAY[0, 1, 2, 3, 4,
> > 5, 6, 5, 4, 3, 2, 1, 0]);
> > /);
> > ok(1, "$io_method: stream accessing same block");
> >
> > I assume that the second access of 2 is a hit because we actually did
> > IO for the first one (unlike in the earlier case)?
>
> I think so but to clarify, all second access of [2, 1, 0] blocks are hit; right?

Yes. I tried expanding the comment to elaborate, but it just came out
awkward, so I left it the way it is.

> > For test_inject_foreign, the 3rd test:
> >
> > # Test read stream encountering two buffers that are undergoing the same
> > # IO, started by another backend
> >
> > I see that psql_b is requesting 3 blocks which can be combined into 1
> > IO, which makes it different than the 1st foreign IO test case:
> >
> > ###
> > # Test read stream encountering buffers undergoing IO in another backend,
> > # with the other backend's reads succeeding.
> > ###
> >
> > where psql_b only requests 1 but I don't really see how these are
> > covering different code. Maybe if the read stream one (psql_a) is
> > doing a combined IO it might exercise slightly different code, but
> > otherwise I don't get it.
>
> I think the main difference is that:
>
> > ###
> > # Test read stream encountering buffers undergoing IO in another backend,
> > # with the other backend's reads succeeding.
> > ###
>
> SELECT array_agg(blocknum) FROM read_stream_for_blocks('largeish',
> ARRAY[0, 2, 5, 7]);
>
> We need to join waiting block number 5 and then start another IO for
> block number 7.
>
> > # Test read stream encountering two buffers that are undergoing the same
> > # IO, started by another backend
>
> SELECT array_agg(blocknum) FROM read_stream_for_blocks('largeish',
> ARRAY[0, 2, 4]);
>
> We need to join waiting block number 2 but after waiting for an IO, IO
> for block number 4 should be already completed too. We don't need to
> start IO like the other case.

Ah, makes sense. Thanks!

- Melanie

Attachment	Content-Type	Size
v5-0001-aio-Refactor-tests-in-preparation-for-more-tests.patch	text/x-patch	10.8 KB
v5-0002-test_aio-Add-read_stream-test-infrastructure-test.patch	text/x-patch	23.1 KB
v5-0003-Fix-off-by-one-error-in-read-IO-tracing.patch	text/x-patch	1.1 KB
v5-0004-Make-buffer-hit-helper.patch	text/x-patch	6.0 KB
v5-0005-Don-t-wait-for-already-in-progress-IO.patch	text/x-patch	21.8 KB

In response to

Re: Don't synchronously wait for already-in-progress IO in read stream at 2026-03-06 13:18:13 from Nazir Bilal Yavuz

Responses

Re: Don't synchronously wait for already-in-progress IO in read stream at 2026-03-17 17:26:02 from Andres Freund

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Michael Paquier	2026-03-16 21:49:17	Re: Add starelid, attnum to pg_stats and leverage this in pg_dump
Previous Message	Matthias van de Meent	2026-03-16 21:45:01	Re: Adding REPACK [concurrently]