| From: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Tomas Vondra <tv(at)fuzzy(dot)cz>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> |
| Subject: | Re: AIO / read stream heuristics adjustments for index prefetching |
| Date: | 2026-03-31 19:31:39 |
| Message-ID: | CAAKRu_YvSjYvKtsTsRqm0=Jy0QhpUTsccRd66BsSKrdN-7ZA1g@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tue, Mar 31, 2026 at 12:02 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> 0001+0002: Return whether WaitReadBuffers() needed to wait
>
> The first patch allows pgaio_wref_check_done() to work more reliably with
> io_uring. Until now it only was able to return true if userspace already
> had consumed the kernel's completion event, but returned false otherwise.
> That's not really incorrect, just suboptimal.
>
> The second patch returns whether WaitReadBuffers() needed to wait for
> IO. This is useful for a) instrumentation like in [2] and b) to provide
> information to the read_stream heuristics to control how aggressive to
> perform read ahead.
These both look good to me except in 0001 you left an XXX in
pgaio_wref_check_done() that I think is the very thing that commit
does.
> 0003: read_stream: Issue IO synchronously while in fast path
LGTM
> 0004: read_stream: Prevent distance from decaying too quickly
>
> There are two minor questions here:
> - Should read_stream_pause()/read_stream_resume() restore the "holdoff"
> counter? I doubt it matters for the prospective user, since it will
> only be used when the lookahead distance is very large.
I don't really understand this. We have to do this with distance
because we set it to 0 and use distance == 0 to indicate stream end.
read_stream_pause() doesn't set the distance_decay_holoff to 0. If you
mean, should we reset holdoff to its initial value, then I don't think
so. I imagine that users doing a lot of pause and resume may not have
high distance.
> - For how long to hold off distance reductions? Initially I was torn
> between using "max_pinned_buffers" (Min(max_ios * io_combine_limit,
> cap)) and "max_ios" ([maintenance_]effective_io_concurrency). But I
> think the former makes more sense, as we otherwise won't allow for far
> enough readahead when doing IO combining, and it does seem to make sense
> to hold off decay for long enough that the maximum lookahead could not
> theoretically allow us to start an IO.
I agree. 0004 LGTM otherwise.
- Melanie
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Masahiko Sawada | 2026-03-31 19:40:14 | Re: Initial COPY of Logical Replication is too slow |
| Previous Message | Masahiko Sawada | 2026-03-31 19:29:37 | Re: Initial COPY of Logical Replication is too slow |