Re: index prefetching

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>
Subject: Re: index prefetching
Date: 2023-12-09 18:08:20
Message-ID: cd63c8bc-3d3c-00ab-f7eb-fa55969c3234@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Here's a simplified version of the patch series, with two important
changes from the last version shared on 2023/11/24.

Firstly, it abandons the idea to use preadv2() to check page cache. This
initially seemed like a great way to check if prefetching is needed, but
in practice it seems so expensive it's not really beneficial (especially
in the "cached" case, which is where it matters most).

Note: There's one more reason to not want rely on preadv2() that I
forgot to mention - it's a Linux-specific thing. I wouldn't mind using
it to improve already acceptable behavior, but it doesn't seem like a
great idea if performance without would be poor.

Secondly, this reworks multiple aspects of the "layering".

Until now, the prefetching info was stored in IndexScanDesc and
initialized in indexam.c in the various "beginscan" functions. That was
obviously wrong - IndexScanDesc is just a description of what the scan
should do, not a place where execution state (which the prefetch queue
is) should be stored. IndexScanState (and IndexOnlyScanState) is a more
appropriate place, so I moved it there.

This also means the various "beginscan" functions don't need any changes
(i.e. not even get prefetch_max), which is nice. Because the prefetch
state is created/initialized elsewhere.

But there's a layering problem that I don't know how to solve - I don't
see how we could make indexam.c entirely oblivious to the prefetching,
and move it entirely to the executor. Because how else would you know
what to prefetch?

With index_getnext_tid() I can imagine fetching XIDs ahead, stashing
them into a queue, and prefetching based on that. That's kinda what the
patch does, except that it does it from inside index_getnext_tid(). But
that does not work for index_getnext_slot(), because that already reads
the heap tuples.

We could say prefetching only works for index_getnext_tid(), but that
seems a bit weird because that's what regular index scans do. (There's a
patch to evaluate filters on index, which switches index scans to
index_getnext_tid(), so that'd make prefetching work too, but I'd ignore
that here. There are other index_getnext_slot() callers, and I don't
think we should accept does not work for those places seems wrong (e.g.
execIndexing/execReplication would benefit from prefetching, I think).

The patch just adds a "prefetcher" argument to index_getnext_*(), and
the prefetching still happens there. I guess we could move most of the
prefether typedefs/code somewhere, but I don't quite see how it could be
done in executor entirely.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
v20231209-0001-prefetch-2023-11-24.patch text/x-patch 56.3 KB
v20231209-0002-reworks.patch text/x-patch 49.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2023-12-09 18:38:57 Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Previous Message Pavel Stehule 2023-12-09 17:59:54 Re: Schema variables - new implementation for Postgres 15