Re: PoC: prefetching index leaf pages (for inserts)

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: PoC: prefetching index leaf pages (for inserts)
Date: 2023-11-23 13:26:23
Message-ID: 7ab3fb79-bdb3-4893-a91d-1a4182aa6da1@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 06/11/2023 19:05, Tomas Vondra wrote:
> As for the path forward, I think the prefetching is demonstrably
> beneficial. There are cases where it can't help or even harms
> performance. I think the success depends on three areas:
>
> (a) reducing the costs of the prefetching - For example right now we
> build the index tuples twice (once for prefetch, once for the insert),
> but maybe there's a way to do that only once? There are also predicate
> indexes, and so on.
>
> (b) being smarter about when to prefetch - For example if we only have
> one "prefetchable" index, it's somewhat pointless to prefetch (for
> single-row cases). And so on.
>
> (c) not prefetching when already cached - This is somewhat related to
> the previous case, but perhaps it'd be cheaper to first check if the
> data is already cached. For shared buffers it should not be difficult,
> for page cache we could use preadv2 with RWF_NOWAIT flag. The question
> is if this is cheap enough to be cheaper than just doing posix_fadvise
> (which however only deals with shared buffers).

I don't like this approach. It duplicates the tree-descend code, and it
also duplicates the work of descending the tree at runtime. And it only
addresses index insertion; there are a lot of places that could benefit
from prefetching or async execution like this.

I think we should think of this as async execution rather than
prefetching. We don't have the general infrastructure for writing async
code, but if we did, this would be much simpler. In an async programming
model, like you have in many other languages like Rust, python or
javascript, there would be no separate prefetching function. Instead,
aminsert() would return a future that can pause execution if it needs to
do I/O. Something like this:

aminsert_futures = NIL;
/* create a future for each index insert */
for (<all indexes>)
{
aminsert_futures = lappend(aminsert_futures, aminsert(...));
}
/* wait for all the futures to finish */
await aminsert_futures;

The async-aware aminsert function would run to completion quickly if all
the pages are already in cache. If you get a cache miss, it would start
an async I/O read for the page, and yield to the other insertions until
the I/O completes.

We already support async execution of FDWs now, with the
ForeignAsyncRequest() and ForeignAsyncConfigureWait() callbacks. Can we
generalize that?

--
Heikki Linnakangas
Neon (https://neon.tech)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2023-11-23 13:32:21 Re: pgsql: meson: docs: Add {html,man} targets, rename install-doc-*
Previous Message Aleksander Alekseev 2023-11-23 13:13:30 Re: Catalog domain not-null constraints