Re: index prefetching

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: index prefetching
Date: 2025-07-21 12:39:48
Message-ID: CA+hUKGL6hCd40Dh1AcFcoiw5zJXK2T1dRKO3oe8RkPExqA5zoQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jul 20, 2025 at 1:07 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> On Sat, Jul 19, 2025 at 11:23 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> > Thanks for the link. It seems I came up with an almost the same patch,
> > with three minor differences:
> >
> > 1) There's another place that sets "distance = 0" in
> > read_stream_next_buffer, so maybe this should preserve the distance too?
> >
> > 2) I suspect we need to preserve the distance at the beginning of
> > read_stream_reset, like
> >
> > stream->reset_distance = Max(stream->reset_distance,
> > stream->distance);
> >
> > because what if you call _reset before reaching the end of the stream?
> >
> > 3) Shouldn't it reset the reset_distance to 0 after restoring it?
>
> Probably. Hmm... an earlier version of this code didn't use distance
> == 0 to indicate end-of-stream, but instead had a separate internal
> end_of_stream flag. If we brought that back and didn't clobber
> distance, we wouldn't need this save-and-restore dance. It seemed
> shorter and sweeter without it back then, before _reset() existed in
> its present form, but I wonder if end_of_stream would be nicer than
> having to add this kind of stuff, without measurable downsides.

...

> Good question. Yeah, your flag idea seems like a good way to avoid
> baking opinion into this level. I wonder if it should be a bitmask
> rather than a boolean, in case we think of more things that need to be
> included or not when resetting.

Here's a sketch of the above two ideas for discussion (.txt to stay
off cfbot's radar for this thread). Better than save/restore?

Here also are some alternative experimental patches for preserving
accumulated look-ahead distance better in cases like that. Needs more
exploration... thoughts/ideas welcome...

Attachment Content-Type Size
0001-aio-Refactor-read_stream.c-end-of-stream-state.txt text/plain 4.4 KB
0002-aio-Add-READ_STREAM_RESET_CONTINUE-flag.txt text/plain 6.3 KB
0003-aio-Improve-read_stream.c-look-ahead-heuristics-A.txt text/plain 3.2 KB
0003-aio-Improve-read_stream.c-look-ahead-heuristics-B.txt text/plain 4.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Matheus Alcantara 2025-07-21 12:47:04 Proposal: QUALIFY clause
Previous Message Nazir Bilal Yavuz 2025-07-21 12:21:35 Re: Improve error reporting in 027_stream_regress test