Quick Links

Improving btree performance through specializing by key shape, take 2

From:	Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc:	Peter Geoghegan <pg(at)bowt(dot)ie>
Subject:	Improving btree performance through specializing by key shape, take 2
Date:	2022-04-08 16:54:55
Message-ID:	CAEze2Wg52tsSWA9Fy7OCXx-K7pPLMNxA_fmQ6-+_pzR-AoODDA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

Here's a new patch, well-timed for the next feature freeze. [0]

Last year I submitted a patchset that updated the way nbtree keys are
compared [1]; by moving from index_getattr to an iterator-based
approach. That patch did perform quite well for multi-column indexes,
but significantly reduced performance for keys that benefit from
Form_pg_attribute->attcacheoff.

Here's generation 2 of this effort. Instead of proceeding to trying to
shoehorn all types of btree keys into one common code path, this
patchset acknowledges that there exist different shapes of keys that
each have a different best way of accessing each subsequent key
attribute. This patch achieves this by specializing the functions to
different key shapes.

The approach is comparable to JIT, but significantly different in that
it optimizes a set of statically defined shapes, and not any arbitrary
shape through a runtime compilation step. Jit could be implemented,
but not easily with the APIs available to IndexAMs: the amhandler for
indexes does not receive any information what the shape of the index
is going to be; so

0001: Moves code that can benefit from key attribute accessor
specialization out of their current files and into specializable
files.
The functions selected for specialization are either not much more
than wrappers for specializable functions, or functions that have
(hot) loops around specializable code; where specializable means
accessing multiple IndexTuple attributes directly.
0002: Updates the specializable code to use the specialized attribute
iteration macros
0003: Optimizes access to the key column when there's only one key column.
0004: Optimizes access to the key columns when we cannot use
attcacheoff for the key columns
0005: Creates a helper function to populate all attcacheoff fields
with their correct values; populating them with -2 whenever a cache
offset is impossible to determine, as opposed to the default -1
(unknown); allowing functions to determine the cachability of the n-th
attribute in O(1).
0006: Creates a specialization macro that replaces rd_indam->aminsert
with its optimized variant, for improved index tuple insertion
performance.

These patches still have some rough edges (specifically: some
functions that are being generated are unused, and intermediate
patches don't compile), but I wanted to get this out to get some
feedback on this approach.

I expect the performance to be at least on par with current btree
code, and I'll try to publish a more polished patchset with
performance results sometime in the near future. I'll also try to
re-attach dynamic page-level prefix truncation, but that depends on
the amount of time I have and the amount of feedback on this patchset.

-Matthias

[0] The one for PG16, that is.
[1] https://www.postgresql.org/message-id/CAEze2Whwvr8aYcBf0BeBuPy8mJGtwxGvQYA9OGR5eLFh6Q_ZvA@mail.gmail.com

Attachment	Content-Type	Size
v1-0004-Implement-specialized-uncacheable-attribute-itera.patch	application/octet-stream	10.5 KB
v1-0005-Add-a-function-whose-task-it-is-to-populate-all-a.patch	application/octet-stream	3.6 KB
v1-0002-Use-specialized-attribute-iterators-in-nbt-_spec..patch	application/octet-stream	8.6 KB
v1-0003-Optimize-attribute-iterator-access-for-single-col.patch	application/octet-stream	1.5 KB
v1-0001-Specialize-nbtree-functions-on-btree-key-shape.patch	application/octet-stream	234.4 KB
v1-0006-Specialize-the-nbtree-rd_indam-entry.patch	application/octet-stream	4.9 KB

Responses

Re: Improving btree performance through specializing by key shape, take 2 at 2022-04-10 21:44:35 from Peter Geoghegan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Peter Geoghegan	2022-04-08 16:56:34	Re: Lowering the ever-growing heap->pd_lower
Previous Message	Nathan Bossart	2022-04-08 16:53:12	Re: avoid multiple hard links to same WAL file after a crash