Re: index prefetching

From: Alexandre Felipe <o(dot)alexandre(dot)felipe(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: index prefetching
Date: 2026-02-16 10:12:12
Message-ID: CAE8JnxOzJQrb44S216Mp71dpqmsaaH5=unJB0zHgVa-+ODPMQA@mail.gmail.com
Lists: pgsql-hackers

> How did you do that? Did you increase the number of rows, make the rows
> wider (by increasing the 'repeat' parameter in the script), or something
> else? Did you verify the table really is 1000x larger?
>

I increased the number of rows by a factor of 1000. I didn't actually check
the size of the table, but the increase in runtime suggests it scaled as
expected.
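For the record, the ratio is easy to sanity-check without trusting runtimes. A back-of-envelope sketch, assuming PostgreSQL's default 8 kB blocks and heap overheads (24-byte page header, 4-byte line pointer, 23-byte tuple header padded to 24); the 100-byte payload and the row counts are placeholders, not taken from the actual script:

```python
# Sketch: rough heap size estimate, to check that 1000x the rows really
# means roughly 1000x the table. Constants are PostgreSQL defaults:
# 8192-byte blocks, 24-byte page header, 4-byte line pointer, 23-byte
# heap tuple header padded to 24 bytes. Payload width is an assumption.
BLCKSZ = 8192
PAGE_HEADER = 24
LINE_POINTER = 4
TUPLE_HEADER = 24

def estimate_heap_bytes(nrows: int, payload_bytes: int) -> int:
    per_tuple = LINE_POINTER + TUPLE_HEADER + payload_bytes
    tuples_per_page = (BLCKSZ - PAGE_HEADER) // per_tuple
    pages = -(-nrows // tuples_per_page)  # ceiling division
    return pages * BLCKSZ

small = estimate_heap_bytes(10_000, 100)
large = estimate_heap_bytes(10_000_000, 100)
print(large / small)  # close to 1000
```

The authoritative check on a live table is of course `SELECT pg_size_pretty(pg_table_size('tbl'));`, which also counts the TOAST relation.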

> The "10k table row" means repeat('x',10000) when generating data? Oh, I
> see you're using some string_agg(), to make it not compress. But note
> that if it's TOASTed, it becomes entirely irrelevant for the prefetching
> test because it's in a separate relation.
>

Sorry, I meant a 10k-row table; for the payload I left a note:

> [b] This time I used a (SELECT string_agg((i*j)::text, '+') FROM
> generate_series(1, 50)) instead of repeat('x', 100), just to prevent it
> from compressing to nothing when I try larger payloads, and hit the
> TOAST thresholds. I removed the primary key `id` because it was annoying
> to take 20 minutes to insert the data in the large scale test.

> Unfortunately, you have not included the new script, so we can't try
> reproducing your results.
>

Let me try to find something not so insane.

> 128kB shared buffers is a little bit ... insane. I refuse to optimize
> anything for this value, and I don't even care about regressions. Even
> 128MB is not really practical; any serious system caring about
> performance will use tens or hundreds of GBs of shared buffers.
>

I am not going to dispute that.

> > If the tests I am doing are pointless, should we consider having
> > something in the planner to prevent these scans from using prefetch?
> >
>
> How would you do that? Please explain.
>

I have no idea. But based on what you said, I thought you would. In my
head: "If my test seemed ridiculous to them all, they must have some clear
boundaries in mind that they could write into the planner."

> ... or you could modify the script to simply use sudo.
>
In that case sudo would prompt the caller for a password, and the caller is
a Python script, so there is no interaction. Of course I could do all the
steps manually, but that is more error prone (my own mistakes are enough).
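For what it's worth, a common workaround is a NOPASSWD sudoers entry scoped to just the privileged command, combined with `sudo -n`, which fails immediately instead of hanging on a password prompt. A minimal sketch; the `drop_caches` step and helper names are hypothetical, not taken from the actual script:

```python
# Sketch: running one privileged step from an unattended benchmark
# script. 'sudo -n' (non-interactive) errors out instead of prompting,
# so a missing NOPASSWD sudoers rule fails fast rather than hanging.
# The drop_caches command and helper names are hypothetical examples.
import subprocess

def build_sudo_cmd(cmd: list[str]) -> list[str]:
    # -n: never prompt; fail if a password would be required
    return ["sudo", "-n"] + cmd

def drop_caches() -> None:
    cmd = build_sudo_cmd(["sh", "-c", "echo 3 > /proc/sys/vm/drop_caches"])
    result = subprocess.run(cmd)
    if result.returncode != 0:
        raise RuntimeError(
            "sudo -n failed; add a NOPASSWD sudoers rule for this "
            "command, or run the privileged steps by hand")
```

The sudoers rule would then be restricted to the exact command line, so the script gains no broader privileges.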

Regards,
Alexandre
