Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Christensen <david(at)pgguru(dot)net>
Subject: Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)
Date: 2023-11-01 06:47:18
Message-ID: CAFj8pRD-WPgDy0o6CYwbAqQLsU=Z2OKOK5u48LrfCVx2rH5D+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

út 31. 10. 2023 v 22:12 odesílatel Matthias van de Meent <
boekewurm+postgres(at)gmail(dot)com> napsal:

> Hi,
>
> Currently, nbtree code compares each and every column of an index
> tuple during the binary search on the index page. With large indexes
> that have many duplicate prefix column values (e.g. an index on (bool,
> bool, uuid) ) that means a lot of wasted time getting to the right
> column.
>
> The attached patch improves on that by doing per-page dynamic prefix
> truncation: If we know that on both the right and left side there are
> index tuples where the first two attributes are equal to the scan key,
> we skip comparing those attributes at the current index tuple and
> start with comparing attribute 3, saving two attribute compares. We
> gain performance whenever comparing prefixing attributes is expensive
> and when there are many tuples with a shared prefix - in unique
> indexes this doesn't gain much, but we also don't lose much in this
> case.
>
> This patch was originally suggested at [0], but it was mentioned that
> they could be pulled out into it's own thread. Earlier, the
> performance gains were not clearly there for just this patch, but
> after further benchmarking this patch stands on its own for
> performance: it sees no obvious degradation of performance, while
> gaining 0-5% across various normal indexes on the cc-complete sample
> dataset, with the current worst-case index shape getting a 60%+
> improved performance on INSERTs in the tests at [0].
>

+1

This can be nice functionality. I had a customer with a very slow index
scan - the main problem was a long common prefix like prg010203xxxx.

Regards

Pavel

> Kind regards,
>
> Matthias van de Meent
> Neon (https://neon.tech)
>
> PS. Best served with the downlink right separator/HIKEY optimization
> (separate patch to be submitted soon), and specialization over at [0].
>
> [0]
> https://www.postgresql.org/message-id/CAEze2WiqOONRQTUT1p_ZV19nyMA69UU2s0e2dp+jSBM=j8snuw@mail.gmail.com
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeevan Chalke 2023-11-01 07:00:02 Re: More new SQL/JSON item methods
Previous Message John Naylor 2023-11-01 06:33:26 Commitfest manager November 2023