Re: Don't synchronously wait for already-in-progress IO in read stream

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: Don't synchronously wait for already-in-progress IO in read stream
Date: 2026-03-26 21:43:17
Message-ID: CAAKRu_a8htJqvgMAMtA54zX35EZKhx3j7nLDXhksbciCxiNaJQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 25, 2026 at 5:58 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> I'm planning to commit 0001 soon - it hasn't changed in a while. Then I'd like
> to get 0002 committed soon after, but I'll hold off for that until tomorrow,
> given that nobody has looked at it (as it's new). I think 0004-0007 can be
> committed too, but I am not sure if you (Melanie) want to do so.

This is a review of 0002

in read_buffers():
- you forgot to initialize operation->forknum
- I think the output columns could really use a bit of documentation
- I assume that using uint16 for nblocks and nios is to make sure it
plays nice with the input nblocks which is an int32 -- it doesn't
matter for the range of values you're using, but it would be better if
you could just use uint32 everywhere but I guess because we only have
int4 in SQL you can't.
anyway, I find all the many different types of ints and uints in
read_buffers() pretty sus. Like why do you need this cast to int16?
that seems...wrong and unnecessary?
values[3] = Int32GetDatum((int16) nblocks_this_io);

in the evict_rel() refactor:
- you need invalidate_one_block() to use the forknum parameter because
otherwise temp tables won't evict any forks except the main fork

for the tests themselves:
- for the first test,
# check that one larger read is done as multiple reads
isn't the comment actually the opposite of what it is testing?

0|0|t|2 -- would be 1 2 block io starting at 0, no?

seems like something like
# check that consecutive misses are combined into one read
would be better

- for this comment:
# but if we do it again, i.e. it's in s_b, there will be two operations
technically you are also doing this for temp tables, so the comment
isn't entirely correct.

- For this test:
# Verify that we aren't doing reads larger than io_combine_limit
isn't this more just covering the logic in read_buffers()? AFAICT
StartReadBuffers() only worries about the max IOs it can combine if it
is near the segment boundary

- For this:
$psql_a->query_safe(qq|SELECT invalidate_rel_block('$table', 1)|);
$psql_a->query_safe(qq|SELECT invalidate_rel_block('$table', 2)|);
$psql_a->query_safe(qq|SELECT * FROM read_buffers('$table', 3, 2)|);
psql_like(
$io_method,
$psql_a,
"$persistency: read buffers, miss 1-2, hit 3-4",
qq|SELECT blockoff, blocknum, needs_io, nblocks FROM
read_buffers('$table', 1, 4)|,
qr/^0\|1\|t\|2\n2\|3\|f\|1\n3\|4\|f\|1$/,
qr/^$/);

I think this is a duplicate. There is one before and after the "verify
we aren't doing reads larger than io_combine_limit"

- It may be worth adding one more test case which is IO in progress on
the last block since you have in-progress as the first and the middle
blocks but not as the last block

# Test in-progress IO on the last block of the range
$psql_a->query_safe(qq|SELECT evict_rel('$table')|);
$psql_a->query_safe(
qq|SELECT read_rel_block_ll('$table', 3, wait_complete=>false)|);
psql_like(
$io_method,
$psql_a,
"$persistency: read buffers, in-progress 3, read 1-3",
qq|SELECT blockoff, blocknum, needs_io, nblocks FROM
read_buffers('$table', 1, 3)|,
qr/^0\|1\|t\|2\n2\|3\|f\|1$/,
qr/^$/);

- Melanie

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2026-03-26 21:47:05 Re: doc: create table improvements
Previous Message David Rowley 2026-03-26 21:29:47 Re: another autovacuum scheduling thread