From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: index prefetching
Date: 2025-09-04 18:55:19
Message-ID: c515d037-9c9c-40d3-9f1f-f0e68bf5842b@vondra.me
Lists: pgsql-hackers

On 9/3/25 22:06, Andres Freund wrote:
> ...
>
> I continue to be worried that we're optimizing for queries that have no
> real-world relevance. The regression afaict is contingent on
>
> 1) An access pattern that is unpredictable to the CPU (due to the use of
> random() as part of ORDER BY during the data generation)
>
> 2) Index and heap are somewhat correlated, but fuzzily, i.e. there are
> backward jumps in the heap block numbers being fetched
>

Aren't those two points rather contradictory? Why would it matter that
the data generator uses random() in the ORDER BY? That seems entirely
irrelevant if the generated table ends up "somewhat correlated".

Which seems pretty normal in real-world data sets ...
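
To be concrete, here's roughly what I have in mind (just a sketch with
made-up names, not the exact benchmark generator) - a small random
offset in the ORDER BY produces a heap that mostly follows the index
order, with occasional backward jumps:

-- hypothetical sketch, not the actual benchmark script
CREATE TABLE fuzzy (id bigint, payload text);

INSERT INTO fuzzy
SELECT i, repeat('x', 200)
  FROM generate_series(1, 1000000) AS i
 ORDER BY i + (random() * 1000)::bigint;  -- small noise => fuzzy correlation

CREATE INDEX ON fuzzy (id);

An index scan on "id" then fetches heap blocks in roughly ascending
order, with occasional jumps back - which is the combination of 1) and 2).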

> 3) There are 1 - small_number tuples on each heap page
>

What would you consider a reasonable number of tuples on one heap page?

The current tests generate data with 20-100 tuples per page, which seems
pretty reasonable to me. I mean, that's 80-400B per tuple. Sure, I could
generate data with narrower tuples, but would that be more realistic? I
doubt that.

FWIW it's not like the regressions only happen with fillfactor=20, i.e.
20 tuples/page. They happen with fillfactor=100 too (sure, the impact is
smaller).
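
To spell out the arithmetic (a rough sketch, assuming the default 8kB
block size; the table definition is made up):

-- ~400B tuples => ~20 tuples/page, ~80B tuples => ~100 tuples/page
-- fillfactor=20 leaves 80% of each page empty, so even narrow tuples
-- end up at roughly 20 tuples/page
CREATE TABLE t (a bigint, b text) WITH (fillfactor = 20);

-- after loading data and running ANALYZE, check the actual density
SELECT reltuples / relpages AS tuples_per_page
  FROM pg_class WHERE relname = 't';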

> 4) The query scans a huge number of tuples, without actually doing any
> meaningful analysis on the tuples. As soon as one does meaningful work for
> returned tuples, the small difference in per-tuple CPU costs vanishes
>

I believe I already responded to this before. Sure, the relative
regression will get smaller. But I don't see why the absolute
difference would get smaller.

> 5) The query visits all heap pages within a range, just not quite in
> order. Without that the kernel readahead would not work and the query's
> performance without readahead would be terrible even on low-latency storage
>

I'm sorry, I don't quite understand what this says :-( Or why it would
mean the issues triggered by the generated data sets are not valid for
real-world queries.

> This just doesn't strike me as a particularly realistic combination of
> factors?
>

Aren't plenty of real-world data sets correlated, but not perfectly?

In any case, I'm the first to admit these data sets are synthetic. The
generator is meant to produce data sets that gradually shift from
perfectly ordered to random, with an increasing number of duplicates,
etc. The point was to cover a wider range of data sets, not just a
couple of "usual" ones.
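
Roughly like this (a simplified sketch of the idea, not the actual
generator script - the knobs and names are made up):

-- "dup" controls duplicates per key value; "noise" = 0 keeps the data
-- perfectly ordered, larger values make the order increasingly random
\set noise 0.05
\set dup 10

CREATE TABLE dataset (key bigint, payload text);

INSERT INTO dataset
SELECT (i / :dup), md5(i::text)
  FROM generate_series(1, 1000000) AS i
 ORDER BY i + (random() * 1000000 * :noise)::bigint;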

It's possible some of these data sets are not realistic, in which case
we can choose to ignore them and the regressions they trigger. The
approach tends to find "adversarial" cases and hit corner cases (not
necessarily as rare as assumed), etc. But the issues we've run into so
far seem perfectly valid (or at least useful to think about).

regards

--
Tomas Vondra
