Re: index prefetching

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: index prefetching
Date: 2025-08-28 16:16:07
Message-ID: onbn3rx35x6k7mfnsmejnebt4nahnii3qnjrac2jzdh3puwo6t@dzjzsx5ppaj7
Lists: pgsql-hackers

Hi,

On 2025-08-28 14:45:24 +0200, Tomas Vondra wrote:
> On 8/26/25 17:06, Tomas Vondra wrote:
> I kept thinking about this, and in the end I decided to try to measure
> this IPC overhead. The backend/ioworker communicate by sending signals,
> so I wrote a simple C program that does "signal echo" with two processes
> (one fork). It works like this:
>
> 1) fork a child process
> 2) send a signal to the child
> 3) child notices the signal, sends a response signal back
> 4) after receiving response, go back to (2)

Nice!

I think this might under-estimate the IPC cost a bit, because typically the
parent and child process do not want to run at the same time, probably leading
to them often being scheduled on the same core. Whereas a shallow IO queue
will lead to some concurrent activity, just not enough to hide the IPC
latency... But I don't think this matters in the grand scheme of things.
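
For readers following along, here is a rough sketch of what such a "signal
echo" loop could look like. This is not Tomas's actual program, just a guess at
its shape under some assumptions: the function name signal_echo_avg_ns, the use
of SIGUSR1 for the ping/echo and SIGUSR2 to stop the child, and sigwait()
instead of signal handlers are all choices made for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: bounce SIGUSR1 between this process and a forked
 * child "rounds" times. Returns the average round-trip time in nanoseconds,
 * or -1.0 on invalid input or setup failure.
 */
double
signal_echo_avg_ns(long rounds)
{
    sigset_t    set, oldset;
    struct timespec start, end;
    pid_t       parent = getpid();
    pid_t       child;
    int         sig;

    if (rounds <= 0)
        return -1.0;

    /* Block both signals before fork() so none is lost; sigwait() consumes them. */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    sigaddset(&set, SIGUSR2);
    if (sigprocmask(SIG_BLOCK, &set, &oldset) != 0)
        return -1.0;

    /* (1) fork a child process */
    child = fork();
    if (child < 0)
    {
        sigprocmask(SIG_SETMASK, &oldset, NULL);
        return -1.0;
    }

    if (child == 0)
    {
        /* (3) child: echo every SIGUSR1 back; anything else ends the loop */
        while (sigwait(&set, &sig) == 0 && sig == SIGUSR1)
            kill(parent, SIGUSR1);
        _exit(0);
    }

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < rounds; i++)
    {
        kill(child, SIGUSR1);       /* (2) send a signal to the child */
        sigwait(&set, &sig);        /* (4) wait for the response, then repeat */
        assert(sig == SIGUSR1);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    /* Stop the child, reap it, and restore the caller's signal mask. */
    kill(child, SIGUSR2);
    waitpid(child, NULL, 0);
    sigprocmask(SIG_SETMASK, &oldset, NULL);

    return ((end.tv_sec - start.tv_sec) * 1e9 +
            (end.tv_nsec - start.tv_nsec)) / rounds;
}
```

Calling signal_echo_avg_ns(100000) from a small main() and printing the result
gives a per-round-trip figure. Because only one side is runnable at any time,
the scheduler is free to keep both processes on one core, which is the effect
described above; comparing against a run where the two processes are pinned to
different cores would show how much that matters.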

> So I think the IPC overhead with "worker" can be quite significant,
> especially for cases with distance=1. I don't think it's a major issue
> for PG18, because seq/bitmap scans are unlikely to collapse the distance
> like this. And with larger distances the cost amortizes. It's much
> bigger issue for the index prefetching, it seems.

I couldn't keep up with all the discussion, but are there actually valid I/O
bound cases (i.e. not ones where we erroneously keep the distance short) where
index scans can't have a higher distance?

Obviously you can construct cases with a low distance by having indexes point
to a lot of tiny tuples on perfectly correlated pages, but in that case IO
can't be a significant factor.

Greetings,

Andres Freund
