Re: index prefetching

From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: index prefetching
Date: 2025-07-21 06:53:45
Message-ID: CAN55FZ0y57uEK+Ts8s3NV9Gyg=YiAV-y610XJpfS+jdCh_7f5g@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Mon, 21 Jul 2025 at 03:59, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Sun, Jul 20, 2025 at 1:07 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > On Sat, Jul 19, 2025 at 11:23 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> > > The thing that however concerns me is that what I observed was not the
> > > distance getting reset to 1, and then ramping up. Which should happen
> > > pretty quickly, thanks to the doubling. In my experiments it *never*
> > > ramped up again, it stayed at 1. I still don't quite understand why.
> >
> > Huh. Will look into that on Monday.
>
> I suspect that it might be working as designed, but suffering from a
> bit of a weakness in the distance control algorithm, which I described
> in another thread[1]. In short, the simple minded algorithm that
> doubles on miss and subtracts one on hit can get stuck alternating
> between 1 and 2 if you hit certain patterns. Bilal pinged me off-list
> to say that he'd repro'd something like your test case and that's what
> seemed to be happening, anyway? I will dig out my experimental
> patches that tried different adjustments to escape from that state....

I used Tomas Vondra's test [1]. I tracked how many times
StartReadBuffersImpl() returns true (IO is needed) and false
(IO is not needed, cache hit). It returns true ~6% of the time with
both the simple and the complex patch (~116000 times true, ~1900000
times false with both patches).

The complex patch ramps up to a distance of ~250 at the start of the
stream, and 6% is enough to keep it there. Actually, it is enough to
ramp up further, but the max distance seems to be ~270, so it stays
there. On the other hand, the simple patch doesn't ramp up at the start
of the stream, and 6% is not enough to make it ramp up. It is always
the same cycle: the distance is 1 and IO is needed, so the distance is
doubled to 2, but then the next block is cached, so the distance is
decremented by 1 and is back at 1.
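
As a rough illustration of the behaviour described above, here is a toy
model of the "double on miss, subtract one on hit" rule. It is only a
sketch and not the actual read stream code: the function names, the
evenly spaced miss pattern (one miss every 17 blocks, i.e. about 6%),
the cap of 270 and the starting distances of 1 and 250 are assumptions
taken from the rough numbers in this mail.

```python
def simulate_distance(start, max_distance, pattern):
    """Return the distance after each block of a hit/miss pattern.

    pattern holds booleans: True = IO needed (miss), False = cache hit.
    Toy model only: double on miss (capped), subtract one on hit (floor 1).
    """
    distance = start
    history = []
    for needs_io in pattern:
        if needs_io:
            distance = min(distance * 2, max_distance)
        else:
            distance = max(distance - 1, 1)
        history.append(distance)
    return history


def periodic_pattern(n_blocks, period):
    """Hypothetical workload: exactly one miss every `period` blocks."""
    return [i % period == 0 for i in range(n_blocks)]


# One miss in 17 blocks is roughly the ~6% miss rate seen in the test.
pattern = periodic_pattern(1700, 17)

# "Simple patch" case: the stream starts at distance 1.
simple = simulate_distance(1, 270, pattern)

# "Complex patch" case: the stream has already ramped up to ~250.
ramped = simulate_distance(250, 270, pattern)

print("start at 1:   min", min(simple), "max", max(simple))
print("start at 250: min", min(ramped), "max", max(ramped))
```

In this model the stream that starts at 1 never gets above 2, because
each doubling is undone by the very next hit, while the stream that
starts at 250 loses only 16 between misses and is pushed back to the
cap by each miss.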

[1] https://www.postgresql.org/message-id/aa46af80-5219-47e6-a7d0-7628106965a6%40vondra.me

--
Regards,
Nazir Bilal Yavuz
Microsoft
