Re: Improving btree performance through specializing by key shape, take 2

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To:
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving btree performance through specializing by key shape, take 2
Date: 2022-06-05 19:12:36
Message-ID: CAEze2Whte2hhjgvD-KKtwFJ+j+FEhPUtMp6b6WhTzqE8r94X9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 16 Apr 2022 at 01:05, Matthias van de Meent
<boekewurm+postgres(at)gmail(dot)com> wrote:
> Still to-do:
>
> - Validate performance and share the numbers for the same test indexes
> in [1]. I'm planning on doing that next Monday.

While working on benchmarking the v2 patchset, I noticed no
improvement on reindex, which I attributed to forgetting to also
specialize comparetup_index_btree in tuplesorth.c. After adding the
specialization there as well (attached in v3), reindex performance
improved significantly too.

Performance results attached in pgbench_log.[master,patched], which
includes the summarized output. Notes on those results:

- single-column reindex seem to have the same performance between
patch and master, within 1% error margins.
- multi-column indexes with useful ->attcacheoff also sees no obvious
performance degradation
- multi-column indexes with no useful ->attcacheoff see significant
insert performance improvement:
-8% runtime on 3 text attributes with default (C) collation ("ccl");
-9% runtime on 1 'en_US'-collated attribute + 2 text attributes
("ccl_collated");
-13% runtime on 1 null + 3 text attributes ("accl");
-74% runtime (!) on 31 'en_US'-collated 0-length text attributes + 1
uuid attribute ("worstcase" - I could not think of a worse index shape
than this one).
- reindex performance gains are much more visible: up to 84% (!) less
time spent for "worstcase", and 18-31% for the other multi-column
indexes mentioned above.

Other notes:
- The dataset I used is the same as I used in [1]: the pp-complete.csv
as was available on 2021-06-20, containing 26070307 rows.
- The performance was measured on 7 runs of the attached bench script,
using pgbench to measure statement times etc.
- Database was initialized with C locale, all tables are unlogged and
source table was VACUUM (FREEZE, ANALYZE)-ed before starting.
- (!) On HEAD @ 5bb2b6ab, INSERT is faster than REINDEX for the
"worstcase" index. I've not yet discovered why (only lightly looked
into it, no sort debugging), and considering that the issue does not
appear in similar quantities in the patched version, I'm not planning
on putting a lot of time into that.
- Per-transaction results for the run on master were accidentally
deleted, I don't consider them important enough to re-run the
benchmark.

> - Decide whether / how to keep the NBTS_ENABLED flag. The current
> #define in nbtree.h is a bad example of a compile-time configuration,
> that should be changed (even if we only want to be able to disable
> specialization at compile-time, it should be moved).
>
> Maybe:
>
> - More tests: PG already extensively tests the btree code while it is
> running the test suite - btree is the main index AM - but more tests
> might be needed to test the validity of the specialized code.

No work on those points yet.

I'll add this to CF 2022-07 for tracking.

Kind regards,

Matthias van de Meent.

[1] https://www.postgresql.org/message-id/flat/CAEze2WhyBT2bKZRdj_U0KS2Sbewa1XoO_BzgpzLC09sa5LUROg%40mail.gmail.com#fe3369c4e202a7ed468e47bf5420f530

Attachment Content-Type Size
performance_comparison.txt text/plain 723 bytes
v3-0003-Specialize-the-nbtree-rd_indam-entry.patch application/x-patch 4.0 KB
v3-0001-Specialize-nbtree-functions-on-btree-key-shape.patch application/x-patch 233.6 KB
v3-0002-Use-specialized-attribute-iterators-in-backend-nb.patch application/x-patch 9.3 KB
v3-0004-Optimize-attribute-iterator-access-for-single-col.patch application/x-patch 3.2 KB
v3-0005-Add-a-function-whose-task-it-is-to-populate-all-a.patch application/x-patch 3.6 KB
v3-0008-Add-specialization-to-btree-index-creation.patch application/x-patch 12.0 KB
v3-0007-Implement-dynamic-prefix-compression-in-nbtree.patch application/x-patch 20.3 KB
v3-0006-Implement-specialized-uncacheable-attribute-itera.patch application/x-patch 9.3 KB
pgbench_log.master application/octet-stream 2.8 KB
pgbench_log.patched application/octet-stream 3.1 KB
bench.sql application/octet-stream 3.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2022-06-05 19:28:02 Reducing Memory Consumption (aset and generation)
Previous Message Anton A. Melnikov 2022-06-05 10:38:01 Re: [PATCH] Fix pg_upgrade test from v10