Re: index prefetching

From:	Tomas Vondra <tomas(at)vondra(dot)me>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Peter Geoghegan <pg(at)bowt(dot)ie>, Alexandre Felipe <o(dot)alexandre(dot)felipe(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject:	Re: index prefetching
Date:	2026-02-18 15:39:30
Message-ID:	2f7fdaa6-2855-4a49-884c-16b91db9a97b@vondra.me
Lists:	pgsql-hackers

On 2/18/26 05:21, Andres Freund wrote:
> Hi,
>
> On 2026-02-17 22:36:53 +0100, Tomas Vondra wrote:
>> On 2/17/26 21:16, Peter Geoghegan wrote:
>>> On Tue, Feb 17, 2026 at 2:27 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>>>> On 2026-02-17 12:16:23 -0500, Peter Geoghegan wrote:
>>>>> On Mon, Feb 16, 2026 at 11:48 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>>>>> I agree that the current heuristics (which were invented recently) are
>>>>> too conservative. I overfit the heuristics to my current set of
>>>>> adversarial queries, as a stopgap measure.
>>>>
>>>> Are you doing any testing on higher latency storage? I found it to be quite
>>>> valuable to use dm_delay to have a disk with reproducible (i.e. not cloud)
>>>> higher latency (i.e. not just a local SSD).
>>>
>>> I sometimes use dm_delay (with the minimum 1ms delay) when testing,
>>> but don't do so regularly. Just because it's inconvenient to do so
>>> (perhaps not a great reason).
>>>
>>>> Low latency NVMe can reduce the
>>>> penalty of not enough readahead so much that it's hard to spot problems...
>>>
>>> I'll keep that in mind.
>>>
>>
>> So, what counts as "higher latency" in this context? What delays should
>> we consider practical/relevant for testing?
>
> 0.5-4ms is the range I've seen in various clouds across their reasonable
> storage products (i.e. not spinning disks or other very bulk-oriented things).
>
> Unfortunately dm_delay doesn't support < 1ms delays, but it's still much
> better than nothing.
>
> I've been wondering about teaching AIO to delay IOs (by adding a sleep to
> workers and linking an IORING_OP_TIMEOUT submission with the actually intended
> IO) to allow testing smaller delays.
>

Could be a useful testing facility, if it's done in a way that does not
limit the IO concurrency (i.e. the delay should probably be applied when
consuming the IO, based on the timestamp of the IO start).

>
>>> That would make sense. You can already tell when that's happened by
>>> comparing the details shown by EXPLAIN ANALYZE against the same query
>>> execution on master, but that approach is inconvenient. Automating my
>>> microbenchmarks has proven to be important with this project. There's
>>> quite a few competing considerations, and it's too easy to improve one
>>> query at the cost of regressing another.
>>>
>>
>> What counts as "unconsumed IO"? The IOs the stream already started, but
>> then did not consume? That shouldn't be hard, I think.
>
> Yes, the number of IOs that were started but not consumed. Or, even better,
> the number of IOs that completed but were not consumed - but that'd be harder
> to get right now.
>
> I agree that started-but-not-consumed should be pretty easy.
>

I'll try to add it to the EXPLAIN output.
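
A rough sketch of the started-but-not-consumed counter, again in Python
and with hypothetical names (StreamStats, run_scan); this is not the
read stream API, just a model of a scan that looks ahead up to a fixed
distance and stops early, e.g. because of a LIMIT.

```python
from collections import deque


class StreamStats:
    """Counts IOs started and IOs consumed by the scan."""

    def __init__(self):
        self.started = 0
        self.consumed = 0

    def unconsumed(self):
        # IOs that were started but never consumed.
        return self.started - self.consumed


def run_scan(total_blocks, distance, stop_after):
    """Model a scan keeping up to `distance` IOs in flight,
    ending once `stop_after` blocks have been consumed."""
    stats = StreamStats()
    in_flight = deque()
    next_block = 0
    while stats.consumed < stop_after:
        # Top up the look-ahead window.
        while len(in_flight) < distance and next_block < total_blocks:
            in_flight.append(next_block)
            next_block += 1
            stats.started += 1
        if not in_flight:
            break  # ran out of blocks before reaching stop_after
        in_flight.popleft()
        stats.consumed += 1
    return stats
```

For example, run_scan(100, 8, 10) starts 17 IOs and consumes 10, so 7
IOs are wasted. Counting completed-but-not-consumed IOs would also need
per-IO completion state, which this model does not track.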

regards

--
Tomas Vondra
