Re: Don't synchronously wait for already-in-progress IO in read stream

From: Andres Freund <andres(at)anarazel(dot)de>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: Don't synchronously wait for already-in-progress IO in read stream
Date: 2026-03-28 00:01:49
Message-ID: bhbitjpkaosi75kxqkemt7257rbz67j5vqjp4ya73uwl5zonnx@zz6z4vzqd7zh
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2026-03-27 18:59:36 -0400, Melanie Plageman wrote:
> Review of 0004:
>
> I noticed I use the word "backend" _a lot_ in this commit message.
> Here is an attempt at using it less:

Mostly adopted.

> Code and test stuff:
>
> in query_wait_block()
>
> - $node->poll_query_until('postgres',
> - qq(SELECT wait_event FROM pg_stat_activity WHERE pid = $pid),
> - $waitfor);
> + my $waitquery;
> + if ($wait_current_session)
> + {
> + $waitquery =
> + qq(SELECT wait_event FROM pg_stat_activity WHERE pid = $pid);
> + }
> + else
> + {
> + $waitquery =
> + qq(SELECT wait_event FROM pg_stat_activity WHERE wait_event
> = '$waitfor');
> + }
> +
> + note "polling for completion with $waitquery";
> + $node->poll_query_until('postgres', $waitquery, $waitfor);
>
> I guess you need WHERE wait_event = $waitfor to keep from getting more
> than one row returned and then failing to parse properly.

Right. For some reason poll_query_until() doesn't allow pattern matching, so
there's a pretty hard limit to allowing any variability in output.

> But I got tripped up on this thinking won't poll_query_until() already give
> you that?

Not as far as I can tell.

> I started thinking maybe wait_current_session should be a hash so we
> can pass it with a name and it will make the query_wait_block() call
> sites less inscrutable, but maybe that's over-engineering.

I'm not even seeing how that would work - the problem is that the wait event
might be in an IO worker (in case of io_method=worker) or in the frontend
process (in case of io_method=io_uring).

> # Because no IO wref was assigned, block 2 should not report foreign IO
> pump_until($psql_a->{run}, $psql_a->{timeout}, \$psql_a->{stdout},
> qr/0\|1\|t\|f\|2\n2\|3\|t\|f\|3/);
>
> you mean block 3
>
> # Because no IO wref was assigned, block 2 should not report foreign IO
> pump_until($psql_a->{run}, $psql_a->{timeout}, \$psql_a->{stdout},
>
> should say block 3

Ooops.

> You could change the first test
> # Test if a read buffers encounters AIO in progress by another backend, it
> # recognizes that other IO as a foreign IO.
>
> To have 0 as the foreign IO (which is a slightly different code path
> than non-head blocks being the foreign IO) and then you still
> basically have coverage of a non-head block being a foreign IO in the
> following test case that looks for multiple contiguous blocks being
> foreign IO.

Did that, except I also offset to start reading from 1, in the unlikely case
we have something in the path loosing track of block numbers.

> In test_read_buffers_inject:
> # recognizes that other IO as a foreign IO. This time we encounter the
> # foreign IO multiple times.
>
> I find "foriegn IO multiple times" hard to parse. I prefer something
> like "multiple
> buffers undergoing foreign IO"

Somehow the modified version seemed harder to read to me, so I left it as is.

> # B: Read block 2 and wait for the completion hook to be reached (which could
> # be in B itself or in an IO worker)
>
> should say blocks 2-3

Fixed.

> I wonder if it is also worth testing a failed foreign IO (i.e.
> operation->foreign_io && !(buf_state & BM_VALID) in
> WaitReadBuffers()). I don't know that it is much different than the
> other failed IO_IN_PROGRESS cases you are already testing.

It might be, but it's getting late, and I would like to see this go in, to
unblock the prefetching thread. And it'd be somewhat cumbersome to write,
unfortunately. So I'll forgo that for now.

> Otherwise, I think we're ready to go!

Yay.

Pushed.

Thanks for the collab!

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2026-03-28 00:14:41 Re: Shared hash table allocations
Previous Message Heikki Linnakangas 2026-03-27 23:30:56 Re: Changing the state of data checksums in a running cluster