Re: index prefetching

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: index prefetching
Date: 2025-08-14 01:51:43
Message-ID: CAH2-Wzndw4WFgRgxWeagYt1ytMxBY1ZFyRTwoZozF0VXj6=XJA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 13, 2025 at 8:59 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> I investigated this from a different angle, by tracing the I/O request
> generated. using perf-trace. And the patterns are massively different.

I tried a similar approach myself, using a variety of tools. That
didn't get me very far.

> So, Q1 ASC gets to combine the I/O into nice large chunks. But the DESC
> queries end up doing a stream of 8K requests. The Q2 ASC gets to do 16KB
> reads in about half the cases, but the rest is still 8KB.

My randomized version of the forwards scan is about as fast (maybe
even slightly faster) than your original version on my workstation, in
spite of the fact that EXPLAIN ANALYZE reports that the randomized
version does indeed have about a 3x higher "I/O Timings: shared read".
So I tend to doubt that low-level instrumentation will be all that
helpful with debugging the issue.

I suppose that it *might* be helpful if you can use it to spot some
kind of pattern -- a pattern that hints at the real underlying issue.
To me the issue feels like a priority inversion problem. Maybe
slow-ish I/O can lead to very very slow query execution time, due to
some kind of second order effect (possibly an issue on the read stream
side). If that's what this is then the problem still won't be that
there was slow-ish I/O, or that we couldn't successfully combine I/Os
in whatever way. After all, we surely won't be able to combine I/Os
with the randomized version of the queries that I described to the
list this evening -- and yet those are still very fast in terms of
overall execution time (somehow, they are about as fast as the
original variant, that will manage to combine I/Os, in spite of the
obvious disadvantage of requiring random I/O for the heap accesses).

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ajin Cherian 2025-08-14 01:58:26 Re: Improve pg_sync_replication_slots() to wait for primary to advance
Previous Message Peter Smith 2025-08-14 01:44:56 Re: [WIP]Vertical Clustered Index (columnar store extension) - take2