Re: AIO / read stream heuristics adjustments for index prefetching

From: Andres Freund <andres(at)anarazel(dot)de>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Tomas Vondra <tv(at)fuzzy(dot)cz>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: AIO / read stream heuristics adjustments for index prefetching
Date: 2026-04-01 15:49:30
Message-ID: kwzdd2tiow5ai25ehbrsoo6wmiokw5vckjfxle643k6dzskdv6@c2ti7opcwsiv
Lists: pgsql-hackers

Hi,

On 2026-04-01 10:52:03 -0400, Melanie Plageman wrote:
> On Tue, Mar 31, 2026 at 12:02 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > 0008: WIP: read stream: Split decision about look ahead for AIO and combining
> >
> > Until now read stream has used a single look-ahead distance to control
> > lookahead for both IO combining and read-ahead. That's sub-optimal, as we
> > want to do IO combining even when we don't need to do any readahead, as
> > avoiding the syscall overhead is important to reduce CPU overhead when
> > data is in the kernel page cache.
> >
> > This is a prototype for what it could look like to split those
> > decisions, thereby fixing the regression mentioned in 0006.
>
> I wonder if we need to keep the combine_limit member in the read
> stream. Could we just use io_combine_limit without ramping up and
> down? This is mainly for code complexity reasons.

I thought so at first too, but it unfortunately leads to substantial
regressions with index prefetching, due to reading ahead unnecessarily far in
cases where we really just needed one block.
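
To illustrate the split, here is a rough standalone sketch, not the code from
0008. The struct, the function names, the doubling/decrement policy and the two
feedback flags are all assumptions made up for this example; only the idea of
keeping a separate, ramped distance for combining and for read-ahead comes from
the discussion above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for read stream state (not the real struct). */
typedef struct SketchStream
{
	int			combine_limit;		/* upper bound, akin to io_combine_limit */
	int			max_readahead;		/* upper bound for the read-ahead distance */
	int			combine_distance;	/* look-ahead used to build larger reads */
	int			readahead_distance; /* look-ahead used to start IO early */
} SketchStream;

static int
sketch_min(int a, int b)
{
	return a < b ? a : b;
}

static int
sketch_max(int a, int b)
{
	return a > b ? a : b;
}

static void
sketch_init(SketchStream *s, int combine_limit, int max_readahead)
{
	s->combine_limit = combine_limit;
	s->max_readahead = max_readahead;
	/* Start low, so a stream that needs just one block reads just one block. */
	s->combine_distance = 1;
	s->readahead_distance = 1;
}

/*
 * Feedback after a block was consumed (assumed policy, for illustration).
 *
 * needed_io:   the block was not in shared buffers, a read had to be issued
 * had_to_wait: the consumer actually had to wait for that read
 */
static void
sketch_feedback(SketchStream *s, bool needed_io, bool had_to_wait)
{
	/* Combining pays off whenever we issue reads, even page cache hits. */
	if (needed_io)
		s->combine_distance = sketch_min(s->combine_distance * 2,
										 s->combine_limit);
	else
		s->combine_distance = sketch_max(s->combine_distance - 1, 1);

	/* Read-ahead only pays off if we would otherwise wait for IO. */
	if (had_to_wait)
		s->readahead_distance = sketch_min(s->readahead_distance * 2,
										   s->max_readahead);
	else
		s->readahead_distance = sketch_max(s->readahead_distance - 1, 1);
}

/* How far to look ahead: whichever of the two decisions asks for more. */
static int
sketch_lookahead(const SketchStream *s)
{
	return sketch_max(s->combine_distance, s->readahead_distance);
}
```

With data in the kernel page cache (needed_io but never had_to_wait), the
combine distance in this sketch ramps up to the limit while the read-ahead
distance stays at 1. A stream that only ever needs one block never looks
further ahead than that, which is the case that regresses if the combine limit
is applied without ramping.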

> Perhaps to allow fast path reentry, we could use distance_decay_holdoff == 0
> and ios_in_progress == 0 instead of combine_distance == 0.

Somewhat orthogonal: I really dislike the code to re-enter the fast path. I've
now broken it a few times without noticing. Especially when using a lower
distance, it's easy for the gating conditions to be fulfilled if
read_stream_look_ahead() decided to not *yet* do look-ahead, because there's
still a pinned buffer and the distance is low.

ISTM that the gating conditions really should only be checked after we did a
look-ahead and found the block to be a buffer hit.
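
Roughly what I mean, as a standalone sketch. The enum, the struct and both
functions are hypothetical names invented for this example, and the state-only
condition is a simplified assumption, not the actual check in read_stream.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of the most recent look-ahead attempt. */
typedef enum SketchLookaheadResult
{
	SKETCH_LA_SKIPPED,			/* decided not to look ahead *yet* */
	SKETCH_LA_HIT,				/* looked ahead, block was a buffer hit */
	SKETCH_LA_STARTED_IO		/* looked ahead, had to start a read */
} SketchLookaheadResult;

/* Hypothetical, simplified subset of the stream state used for gating. */
typedef struct SketchFastPathState
{
	int			ios_in_progress;
	int			pinned_buffers;
	int			distance;
} SketchFastPathState;

/*
 * Gate based purely on the current state (simplified assumption). With a low
 * distance this is also true when look-ahead was merely deferred.
 */
static bool
sketch_gate_state_only(const SketchFastPathState *s)
{
	return s->ios_in_progress == 0 &&
		s->pinned_buffers == 1 &&
		s->distance == 1;
}

/*
 * Gate that additionally requires that we actually looked ahead and that the
 * block we found was a buffer hit.
 */
static bool
sketch_gate_after_hit(const SketchFastPathState *s,
					  SketchLookaheadResult last)
{
	return last == SKETCH_LA_HIT && sketch_gate_state_only(s);
}
```

With one pinned buffer, distance 1 and no IO in progress, the state-only gate
passes even if the last look-ahead was skipped, whereas the second variant
only passes after an actual hit.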

Greetings,

Andres Freund
