Re: index prefetching

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: index prefetching
Date: 2025-08-12 19:48:50
Message-ID: x3b5pjpttpwz74fpr5zw7avhjmiti3us5g57f2jizabrv23e57@lmo6yiuxnnjj
Lists: pgsql-hackers

Hi,

On 2025-08-12 18:53:13 +0200, Tomas Vondra wrote:
> I'm running some tests looking for these weird changes, not just with
> the patches, but on master too. And I don't think b4212231 changed the
> situation very much.
>
> FWIW this issue is not caused by the index prefetching patches, I can
> reproduce it with master (on b227b0bb4e032e19b3679bedac820eba3ac0d1cf
> from yesterday). So maybe we should split this into a separate thread.
>
> Consider for example the dataset built by create.sql - it's randomly
> generated, with the idea that it's correlated, but not perfectly. The
> table is ~3.7GB, and it's a cold run (caches dropped + restart).
>
> Anyway, a simple range query looks like this:
>
> EXPLAIN (ANALYZE, COSTS OFF)
> SELECT * FROM t WHERE a BETWEEN 16336 AND 49103 ORDER BY a ASC;
>
> QUERY PLAN
> ------------------------------------------------------------------------
> Index Scan using idx on t
> (actual time=0.584..433.208 rows=1048576.00 loops=1)
> Index Cond: ((a >= 16336) AND (a <= 49103))
> Index Searches: 1
> Buffers: shared hit=7435 read=50872
> I/O Timings: shared read=332.270
> Planning:
> Buffers: shared hit=78 read=23
> I/O Timings: shared read=2.254
> Planning Time: 3.364 ms
> Execution Time: 463.516 ms
> (10 rows)
>
> EXPLAIN (ANALYZE, COSTS OFF)
> SELECT * FROM t WHERE a BETWEEN 16336 AND 49103 ORDER BY a DESC;
>
> QUERY PLAN
> ------------------------------------------------------------------------
> Index Scan Backward using idx on t
> (actual time=0.566..22002.780 rows=1048576.00 loops=1)
> Index Cond: ((a >= 16336) AND (a <= 49103))
> Index Searches: 1
> Buffers: shared hit=36131 read=50872
> I/O Timings: shared read=21217.995
> Planning:
> Buffers: shared hit=82 read=23
> I/O Timings: shared read=2.375
> Planning Time: 3.478 ms
> Execution Time: 22231.755 ms
> (10 rows)
>
> That's a pretty massive difference ... this is on my laptop, and the
> timing changes quite a bit, but it's always a multiple of the first
> query with forward scan.

I suspect what you're mainly seeing here is that the OS can do readahead for
us for forward scans, but not for backward scans. Indeed, if I look at
iostat, the forward scan shows:

Device        r/s   rMB/s  rrqm/s  %rrqm  r_await  rareq-sz   w/s  wMB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dMB/s  drqm/s  %drqm  d_await  dareq-sz   f/s  f_await  aqu-sz  %util
nvme6n1   3352.00  400.89    0.00   0.00     0.18    122.47  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.62  47.90

whereas the backward scan shows:

Device        r/s   rMB/s  rrqm/s  %rrqm  r_await  rareq-sz   w/s  wMB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dMB/s  drqm/s  %drqm  d_await  dareq-sz   f/s  f_await  aqu-sz  %util
nvme6n1  10958.00   85.57    0.00   0.00     0.06      8.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.69  63.80

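Note the different read sizes: rareq-sz is ~122kB per request for the forward
scan, but only 8kB (a single page) for the backward scan.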
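
As a quick sanity check on those read sizes, the average request size can be
recomputed from the r/s and rMB/s columns of the two iostat samples. This is
only a sketch: the function and variable names are made up for illustration,
and it assumes iostat's MB means 1024 kB (an assumption the numbers appear to
bear out).

```python
# Figures copied from the two iostat samples quoted in this message.
samples = {
    "forward": {"reads_per_s": 3352.00, "read_mb_per_s": 400.89},
    "backward": {"reads_per_s": 10958.00, "read_mb_per_s": 85.57},
}


def avg_read_request_kb(reads_per_s, read_mb_per_s):
    """Average read request size in kB (assumes 1 MB = 1024 kB)."""
    return read_mb_per_s * 1024 / reads_per_s


for name, sample in samples.items():
    print(f"{name}: {avg_read_request_kb(**sample):.2f} kB per read")
```

The results should land very close to the reported rareq-sz values (about
122 kB forward, about 8 kB backward), i.e. the backward scan issues roughly
three times as many reads per second while moving about a fifth of the data.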

> I did look into pg_aios, but there's only 8kB requests in both cases. I
> didn't have time to look closer yet.

That's what we'd expect, right? There's nothing on master that'd perform read
combining for index scans...
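
To make "read combining" concrete, here is a minimal sketch of the general
idea: given the block numbers a scan is about to visit, merge runs of adjacent
blocks into fewer, larger reads. This is not how PostgreSQL implements it; the
function name, the (first_block, nblocks) representation and the max_blocks
cap are all hypothetical, and it presumes the upcoming block numbers are known
in advance, which is exactly what index scans on master don't have.

```python
def combine_reads(blocks, max_blocks=16):
    """Merge runs of adjacent block numbers into (first_block, nblocks) reads.

    Works for ascending and descending runs, so a backward scan visiting
    blocks 9, 8, 7 becomes a single read starting at block 7.
    max_blocks is a hypothetical cap on the size of one combined read.
    """
    reads = []
    for blk in blocks:
        if reads:
            start, n = reads[-1]
            if n < max_blocks and blk == start + n:
                # Block directly follows the current run: grow it upward.
                reads[-1] = (start, n + 1)
                continue
            if n < max_blocks and blk == start - 1:
                # Block directly precedes the current run: grow it downward.
                reads[-1] = (blk, n + 1)
                continue
        # Not adjacent (or the run is full): start a new read.
        reads.append((blk, 1))
    return reads


# Forward-ordered and backward-ordered visits to the same blocks combine
# into the same large read.
print(combine_reads([4, 5, 6, 7]))
print(combine_reads([7, 6, 5, 4]))
```

With combining like this in the database, the direction of the scan would
matter much less, because the large requests would no longer depend on the
kernel detecting a forward sequential pattern.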

Greetings,

Andres Freund
