Quick Links

Re: index prefetching

From:	Peter Geoghegan <pg(at)bowt(dot)ie>
To:	Tomas Vondra <tomas(at)vondra(dot)me>
Cc:	Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject:	Re: index prefetching
Date:	2025-07-24 01:59:58
Message-ID:	CAH2-Wz=Q9kYv5GZEAv6qCuBH78kLqHbUxiNkxay=KnRqboJTiA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Wed, Jul 23, 2025 at 12:36 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> * The TPC-C order line table primary key.

I tested this for myself.

Tomas' index-prefetch-simple-master branch:

set max_parallel_workers_per_gather =0;
SELECT pg_buffercache_evict_relation('order_line');
select pg_prewarm('order_line_pkey');

:ea select sum(ol_amount) from order_line where ol_w_id < 10;
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│
QUERY PLAN
│
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate (cost=264259.55..264259.56 rows=1 width=32) (actual
time=2015.711..2015.712 rows=1.00 loops=1)
│
│ Output: sum(ol_amount)

│
│ Buffers: shared hit=17815 read=33855

│
│ I/O Timings: shared read=1490.918

│
│ -> Index Scan using order_line_pkey on public.order_line
(cost=0.56..257361.93 rows=2759049 width=4) (actual
time=7.936..1768.236 rows=2700116.00 loops=1) │
│ Output: ol_w_id, ol_d_id, ol_o_id, ol_number, ol_i_id,
ol_delivery_d, ol_amount, ol_supply_w_id, ol_quantity, ol_dist_info
│
│ Index Cond: (order_line.ol_w_id < 10)

│
│ Index Searches: 1

│
│ Index Prefetch: true

│
│ Index Distance: 110.7

│
│ Buffers: shared hit=17815 read=33855

│
│ I/O Timings: shared read=1490.918

│
│ Planning Time: 0.049 ms

│
│ Serialization: time=0.003 ms output=1kB format=text

│
│ Execution Time: 2015.731 ms

│
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(15 rows)

Complex patch (same prewarming/eviction are omitted this time):

:ea select sum(ol_amount) from order_line where ol_w_id < 10;
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│
QUERY PLAN
│
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate (cost=264259.55..264259.56 rows=1 width=32) (actual
time=768.387..768.388 rows=1.00 loops=1)
│
│ Output: sum(ol_amount)

│
│ Buffers: shared hit=17815 read=33855

│
│ I/O Timings: shared read=138.856

│
│ -> Index Scan using order_line_pkey on public.order_line
(cost=0.56..257361.93 rows=2759049 width=4) (actual
time=7.956..493.694 rows=2700116.00 loops=1) │
│ Output: ol_w_id, ol_d_id, ol_o_id, ol_number, ol_i_id,
ol_delivery_d, ol_amount, ol_supply_w_id, ol_quantity, ol_dist_info
│
│ Index Cond: (order_line.ol_w_id < 10)

│
│ Index Searches: 1

│
│ Buffers: shared hit=17815 read=33855

│
│ I/O Timings: shared read=138.856

│
│ Planning Time: 0.043 ms

│
│ Serialization: time=0.003 ms output=1kB format=text

│
│ Execution Time: 768.454 ms

I'm using direct IO in both cases. This can easily be repeated, and is stable.

To be fair, the planner wants to use a parallel index scan for this.
If I allow the scan to be parallel, 5 parallel workers are used. The
simple patch now takes 295.722 ms, while the complex patch takes
301.875 ms. I imagine that that's because the use of parallelism
eliminates the natural advantage that the complex has with this
workload/index -- the scan as a whole is presumably no longer
bottlenecked on physical index characteristics. The parallel workers
can almost behave as 5 independent scans, all kept sufficiently busy,
even without our having to read ahead to later leaf pages.

It's possible that something weird is going on with the prefetch
distance, in the context of parallel scans specifically -- it's not
like we've really tested parallel scans just yet (with either patch).
Even if there is an addressable problem in either patch here, I'd be
surprised if it was the main factor behind the simple patch doing
relatively well when scanning in parallel like this.

--
Peter Geoghegan

In response to

Re: index prefetching at 2025-07-23 16:36:33 from Peter Geoghegan

Responses

Re: index prefetching at 2025-07-24 02:18:55 from Peter Geoghegan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Jeff Davis	2025-07-24 02:11:34	Re: Remaining dependency on setlocale()
Previous Message	David Rowley	2025-07-24 01:47:07	Re: Fixing MSVC's inability to detect elog(ERROR) does not return