Re: index prefetching

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: index prefetching
Date: 2025-08-28 23:52:29
Message-ID: 931afce3-8c86-4c96-9861-0ffa17c6560f@vondra.me
Lists: pgsql-hackers

On 8/29/25 01:27, Andres Freund wrote:
> Hi,
>
> On 2025-08-29 01:00:58 +0200, Tomas Vondra wrote:
>> I'm not sure how to determine what concurrency it "wants". All I know is
>> that for "warm" runs [1], the basic index prefetch patch uses distance
>> ~2.0 on average, and is ~2x slower than master. And with the patches the
>> distance is ~270, and it's 30% slower than master. (IIRC there's about
>> 30% misses, so 270 is fairly high. Can't check now, the machine is
>> running other tests.)
>
> There's got to be something wrong here, I don't see a reason why at any
> meaningful distance it'd be slower.
>
> What set of patches do I need to repro the issue?
>

Use this branch:

https://github.com/tvondra/postgres/commits/index-prefetch-master/

and then Thomas' patch that increases the prefetch distance:

https://www.postgresql.org/message-id/CA%2BhUKGL2PhFyDoqrHefqasOnaXhSg48t1phs3VM8BAdrZqKZkw%40mail.gmail.com

(IIRC there's a trivial conflict in read_stream_reset.)

> And what are the complete set of pieces to load the data?
> https://postgr.es/m/293a4735-79a4-499c-9a36-870ee9286281%40vondra.me
> has the query, but afaict not enough information to infer init.sql
>

Yeah, I forgot to include that piece, sorry. Here's an init.sql that
loads the table; it also has the query.

>
>> Not sure about wait events, but I don't think any backends are doing
>> synchronous I/O. There's only that one query running, and it's using AIO
>> (except for the index, which is still read synchronously).
>>
>> Likewise, I don't think there's an insufficient number of workers. I've
>> tried with 3 and 12 workers, and there's virtually no difference between
>> those. IIRC when watching "top", I've never seen more than 1 or maybe 2
>> workers active (using CPU).
>
> That doesn't say much - if they are doing IO, they're not on CPU...
>

True. But one worker did show up in top, using a fair amount of CPU, so
why wouldn't the others (if they process the same stream)?

regards

--
Tomas Vondra

Attachment: repro.sql (application/sql, 653 bytes)
