Re: AIO / read stream heuristics adjustments for index prefetching

From: Andres Freund <andres(at)anarazel(dot)de>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Tomas Vondra <tv(at)fuzzy(dot)cz>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: AIO / read stream heuristics adjustments for index prefetching
Date: 2026-04-02 21:30:05
Message-ID: hxnsd6madv7em6gataql6fmjekfc5zthb3kp5w3e3mxjdonplf@spudt4jcmeep
Lists: pgsql-hackers

Hi,

On 2026-04-02 17:13:34 -0400, Melanie Plageman wrote:
> On Thu, Apr 2, 2026 at 11:47 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > > On some level, relying on worker mode overhead feels fragile. If
> > > worker overhead decreases—say, by moving to IO worker threads—we won't
> > > be able to rely on this to keep the distance to an advantageous level.
> >
> > I don't see why lower overhead would prevent this from working?
>
> needed_wait has to be true to increase the readahead distance and for
> io_uring, when data was in the kernel buffer cache, needed_wait is
> false, meaning the distance doesn't increase. Worker mode didn't have
> this problem because of overhead. So needed_wait is true for workers.
> But, now that we will have combine_distance, I guess we don't need to
> rely on workers having overhead.

I think we still do, but it will continue to work even if the overhead is
much smaller than today. The workers will complete the IOs only after the
memory copy is finished (duh). Therefore, if the distance is too small to
allow workers to complete the copy before the buffer is needed, the distance
will be increased, due to needed_wait.
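
To illustrate the argument, here is a toy model, not the actual read stream
code. All names (ToyStream, toy_*) and the cost units are hypothetical, and
the "double the distance when we had to wait" rule is an assumption made for
the sketch. It only shows that the distance still grows when dispatch
overhead is zero, as long as the copy takes longer than the time the current
distance buys.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy state; this is not the real ReadStream struct. */
typedef struct ToyStream
{
	int			distance;		/* current look-ahead distance, in blocks */
	int			max_distance;	/* upper bound for the distance */
} ToyStream;

/*
 * Assumed rule: the distance grows (doubling, capped) only when the
 * consumer had to wait for the IO to complete.
 */
void
toy_adjust_distance(ToyStream *s, bool needed_wait)
{
	assert(s->distance >= 1 && s->max_distance >= s->distance);

	if (needed_wait)
	{
		int			next = s->distance * 2;

		s->distance = next < s->max_distance ? next : s->max_distance;
	}
}

/*
 * A worker reports completion only after the copy is done, so the IO takes
 * dispatch_cost + copy_cost. The IO was issued "distance" blocks ahead, which
 * gives it distance * consume_cost time units before the buffer is needed.
 */
bool
toy_worker_needed_wait(const ToyStream *s, int dispatch_cost,
					   int copy_cost, int consume_cost)
{
	return dispatch_cost + copy_cost > s->distance * consume_cost;
}

/* Run the feedback loop for nblocks blocks and return the final distance. */
int
toy_settle_distance(ToyStream *s, int dispatch_cost, int copy_cost,
					int consume_cost, int nblocks)
{
	for (int i = 0; i < nblocks; i++)
		toy_adjust_distance(s,
							toy_worker_needed_wait(s, dispatch_cost,
												   copy_cost, consume_cost));
	return s->distance;
}
```

In this model, with consume_cost = 4 and a starting distance of 1, a dispatch
cost of 10 and a copy cost of 20 settle at a distance of 8. Dropping the
dispatch cost to 0 with a copy cost of 6 still raises the distance to 2. Only
when the copy is cheaper than consuming a single block (copy cost 3) does the
distance stay at 1.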

> So we are saying that readahead_distance is completely irrelevant for
> copying from the kernel buffer cache and only combine_distance matters for
> that now, right?

I don't think so! The combine_distance thing is crucial to allow for IO
combining, and, indirectly, for triggering the size-based "async" heuristics
with io_uring. Once the io_uring async heuristic is triggered, the needed_wait
mechanism works to further increase the distance.

That does mean that for random BLCKSZ-sized IOs whose data is in the page
cache the async mechanism won't typically be triggered - but from what I can
tell that's ok, because lots of 8kB IOs is also where the dispatch overhead to
the kernel threads doing the copying is the highest.
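
As a rough sketch of the interaction described above (again a toy model, not
the real code): combine_distance is modeled here as the number of upcoming
blocks that may be merged into a single IO, and the io_uring "async" heuristic
as a plain size threshold. The threshold value and all toy_* names are
assumptions made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BLCKSZ 8192

/* Hypothetical threshold; the real heuristic's cutoff is not stated here. */
#define TOY_ASYNC_MIN_BYTES (4 * TOY_BLCKSZ)

/*
 * Count how many blocks starting at blocks[start] can be merged into one IO:
 * they must be consecutive, and at most combine_distance blocks are
 * considered.
 */
int
toy_combined_blocks(const unsigned *blocks, int nblocks, int start,
					int combine_distance)
{
	int			n = 1;

	assert(start >= 0 && start < nblocks && combine_distance >= 1);

	while (start + n < nblocks &&
		   n < combine_distance &&
		   blocks[start + n] == blocks[start + n - 1] + 1)
		n++;

	return n;
}

/* Assumed size-based rule: only sufficiently large IOs are treated as async. */
bool
toy_io_goes_async(int combined_blocks)
{
	return (long) combined_blocks * TOY_BLCKSZ >= TOY_ASYNC_MIN_BYTES;
}
```

In this model a run of eight consecutive blocks with a combine_distance of 8
becomes one 64kB IO and crosses the threshold, at which point needed_wait can
start driving the distance up. The same run with a combine_distance of 1, or a
random block sequence with any combine_distance, stays at single 8kB IOs and
never does.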

Greetings,

Andres Freund
