Re: index prefetching

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>
Subject: Re: index prefetching
Date: 2023-10-16 15:34:44
Message-ID: 06bb7d02-2c44-3062-731e-a735ba13da7e@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Attached is a v6 of the patch, which rebases v5 (just some minor
bitrot), and also does a couple changes which I kept in separate patches
to make it obvious what changed.

0001-v5-20231016.patch
----------------------

Rebase to current master.

0002-comments-and-minor-cleanup-20231012.patch
----------------------------------------------

Various comment improvements (remove obsolete ones clarify a bunch of
other comments, etc.). I tried to explain the reasoning why some places
disable prefetching (e.g. in catalogs, replication, ...), explain how
the caching / LRU works etc.

0003-remove-prefetch_reset-20231016.patch
-----------------------------------------

I decided to remove the separate prefetch_reset parameter, so that all
the index_beginscan() methods only take a parameter specifying the
maximum prefetch target. The reset was added early when the prefetch
happened much lower in the AM code, at the index page level, and the
reset was when moving to the next index page. But now after the prefetch
moved to the executor, this doesn't make much sense - the resets happen
on rescans, and it seems right to just reset to 0 (just like for bitmap
heap scans).

0004-PoC-prefetch-for-IOS-20231016.patch
----------------------------------------

This is a PoC adding the prefetch to index-only scans too. At first that
may seem rather strange, considering eliminating the heap fetches is the
whole point of IOS. But if the pages are not marked as all-visible (say,
the most recent part of the table), we may still have to fetch them. In
which case it'd be easy to see cases that IOS is slower than a regular
index scan (with prefetching).

The code is quite rough. It adds a separate index_getnext_tid_prefetch()
function, adding prefetching on top of index_getnext_tid(). I'm not sure
it's the right pattern, but it's pretty much what index_getnext_slot()
does too, except that it also does the fetch + store to the slot.

Note: There's a second patch adding index-only filters, which requires
the regular index scans from index_getnext_slot() to _tid() too.

The prefetching then happens only after checking the visibility map (if
requested). This part definitely needs improvements - for example
there's no attempt to reuse the VM buffer, which I guess might be expensive.

index-prefetch.pdf
------------------

Attached is also a PDF with results of the same benchmark I did before,
comparing master vs. patched with various data patterns and scan types.
It's not 100% comparable to earlier results as I only ran it on a
laptop, and it's a bit noisier too. The overall behavior and conclusions
are however the same.

I was specifically interested in the IOS behavior, so I added two more
cases to test - indexonlyscan and indexonlyscan-clean. The first is the
worst-case scenario, with no pages marked as all-visible in VM (the test
simply deletes the VM), while indexonlyscan-clean is the good-case (no
heap fetches needed).

The results mostly match the expected behavior, particularly for the
uncached runs (when the data is expected to not be in memory):

* indexonlyscan (i.e. bad case) - About the same results as
"indexscans", with the same speedups etc. Which is a good thing
(i.e. IOS is not unexpectedly slower than regular indexscans).

* indexonlyscan-clean (i.e. good case) - Seems to have mostly the same
performance as without the prefetching, except for the low-cardinality
runs with many rows per key. I haven't checked what's causing this,
but I'd bet it's the extra buffer lookups/management I mentioned.

I noticed there's another prefetching-related patch [1] from Thomas
Munro. I haven't looked at it yet, so hard to say how much it interferes
with this patch. But the idea looks interesting.

[1]
https://www.postgresql.org/message-id/flat/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA(at)mail(dot)gmail(dot)com

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
index-prefetch.pdf application/pdf 109.4 KB
0001-v5-20231012.patch text/x-patch 40.8 KB
0002-comments-and-minor-cleanup-20231012.patch text/x-patch 31.2 KB
0003-remove-prefetch_reset-20231012.patch text/x-patch 17.5 KB
0004-PoC-prefetch-for-IOS-20231012.patch text/x-patch 11.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2023-10-16 15:45:07 Re: The danger of deleting backup_label
Previous Message Erik Rijkers 2023-10-16 15:34:14 event trigger sgml touch-up