From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me> |
Cc: | Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
Subject: | Re: index prefetching |
Date: | 2025-08-12 19:48:50 |
Message-ID: | x3b5pjpttpwz74fpr5zw7avhjmiti3us5g57f2jizabrv23e57@lmo6yiuxnnjj |
Lists: | pgsql-hackers |
Hi,
On 2025-08-12 18:53:13 +0200, Tomas Vondra wrote:
> I'm running some tests looking for these weird changes, not just with
> the patches, but on master too. And I don't think b4212231 changed the
> situation very much.
>
> FWIW this issue is not caused by the index prefetching patches, I can
> reproduce it with master (on b227b0bb4e032e19b3679bedac820eba3ac0d1cf
> from yesterday). So maybe we should split this into a separate thread.
>
> Consider for example the dataset built by create.sql - it's randomly
> generated, and the idea is that it's correlated, but not perfectly. The
> table is ~3.7GB, and it's a cold run (caches dropped + restart).
>
> Anyway, a simple range query looks like this:
>
> EXPLAIN (ANALYZE, COSTS OFF)
> SELECT * FROM t WHERE a BETWEEN 16336 AND 49103 ORDER BY a ASC;
>
> QUERY PLAN
> ------------------------------------------------------------------------
> Index Scan using idx on t
> (actual time=0.584..433.208 rows=1048576.00 loops=1)
> Index Cond: ((a >= 16336) AND (a <= 49103))
> Index Searches: 1
> Buffers: shared hit=7435 read=50872
> I/O Timings: shared read=332.270
> Planning:
> Buffers: shared hit=78 read=23
> I/O Timings: shared read=2.254
> Planning Time: 3.364 ms
> Execution Time: 463.516 ms
> (10 rows)
>
> EXPLAIN (ANALYZE, COSTS OFF)
> SELECT * FROM t WHERE a BETWEEN 16336 AND 49103 ORDER BY a DESC;
>
> QUERY PLAN
> ------------------------------------------------------------------------
> Index Scan Backward using idx on t
> (actual time=0.566..22002.780 rows=1048576.00 loops=1)
> Index Cond: ((a >= 16336) AND (a <= 49103))
> Index Searches: 1
> Buffers: shared hit=36131 read=50872
> I/O Timings: shared read=21217.995
> Planning:
> Buffers: shared hit=82 read=23
> I/O Timings: shared read=2.375
> Planning Time: 3.478 ms
> Execution Time: 22231.755 ms
> (10 rows)
>
> That's a pretty massive difference ... this is on my laptop, and the
> timing changes quite a bit, but it's always a multiple of the first
> query with forward scan.
I suspect what you're mainly seeing here is that the OS can do readahead for
us for forward scans, but not for backward scans. Indeed, if I look at
iostat, the forward scan shows:
Device        r/s   rMB/s  rrqm/s  %rrqm  r_await  rareq-sz   w/s  wMB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dMB/s  drqm/s  %drqm  d_await  dareq-sz   f/s  f_await  aqu-sz  %util
nvme6n1   3352.00  400.89    0.00   0.00     0.18    122.47  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.62  47.90
whereas the backward scan shows:
Device        r/s   rMB/s  rrqm/s  %rrqm  r_await  rareq-sz   w/s  wMB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dMB/s  drqm/s  %drqm  d_await  dareq-sz   f/s  f_await  aqu-sz  %util
nvme6n1  10958.00   85.57    0.00   0.00     0.06      8.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.69  63.80
Note the different read sizes (rareq-sz): roughly 122kB per request for the
forward scan, versus 8kB for the backward scan.
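As a sanity check on those two samples, the average request size can be derived from r/s and rMB/s alone. A minimal sketch, assuming iostat's MB means 1024 kB (which is what makes the result agree with the rareq-sz column); the function and variable names are made up for illustration:

```python
# Figures copied from the two iostat samples above.
samples = {
    "forward": {"reads_per_s": 3352.00, "read_mb_per_s": 400.89},
    "backward": {"reads_per_s": 10958.00, "read_mb_per_s": 85.57},
}


def avg_read_kb(reads_per_s: float, read_mb_per_s: float) -> float:
    """Average read request size in kB, i.e. throughput divided by request rate.

    Assumes 1 MB = 1024 kB, which matches the rareq-sz values in the samples.
    """
    return read_mb_per_s * 1024.0 / reads_per_s


for name, s in samples.items():
    size = avg_read_kb(s["reads_per_s"], s["read_mb_per_s"])
    print(f"{name}: {size:.2f} kB per read")
# prints:
# forward: 122.47 kB per read
# backward: 8.00 kB per read
```

So the backward scan issues about three times as many requests per second, but each one is a single 8kB block, and the throughput ends up at roughly a fifth of the forward scan's.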
> I did look into pg_aios, but there's only 8kB requests in both cases. I
> didn't have time to look closer yet.
That's what we'd expect, right? There's nothing on master that'd perform read
combining for index scans...
Greetings,
Andres Freund