| From: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
|---|---|
| To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
| Cc: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Randomize B-Tree page split location to avoid oscillating patterns |
| Date: | 2026-06-16 10:13:47 |
| Message-ID: | jotfal6422zpqvux4add7xcyftaujjiksscfcc5uuibaldw4sv@pzriktpx4gyl |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
> On Thu, May 07, 2026 at 08:10:30PM +0200, Dmitry Dolgov wrote:
>
> To summarize, in comparison with the second approach the third one seems to
> have less overhead, the same guarantees regarding penalty, and more complex
> implementation. I'm fine going with the second approach, but first would like
> to discuss the alternatives and make sure we're on the same page.
Here is a sketch of what I had in mind. First two patches are metrics
only (adding USDT, I think 0001 makes sense to have in general, 0002 is
purely for benchmarking). The third patch implements strategy of
collecting equally good split points and choosing from them, and an
alternative version does the same via shifting the search interval
(attached with the txt suffix to not confuse the CF bot, whatever it
does nowadays).
I've managed to perform the same test as before (a best case scenario,
single column unique index) with a bit more throughput, and both
versions showed the same improvement over the main branch (see the
splits.png, where the patch versions are marked as "Rand (best loc)" and
"Rand (shifted)" for searching equally good split points and shifted
search interval correspondingly). It was also enough to produce visible
improvement for latency stddev as reported by pgbench (see stddev.png).
Measuring latency between newly introduced USDT inside _bt_findsplitloc
in this test shows that the latter version (shifted interval) is marginally
faster, probably due to the former needing to remember new data (equally good
split points):
// Time spent in _bt_findsplitloc for "shifted" version
@nsecs:
[4, 8) 2 | |
[8, 16) 9433 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[16, 32) 11365 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[32, 64) 3940 |@@@@@@@@@@@@@@@@@@ |
[64, 128) 1255 |@@@@@ |
[128, 256) 595 |@@ |
[256, 512) 72 | |
[512, 1K) 0 | |
[1K, 2K) 0 | |
[2K, 4K) 1 | |
// Time spent in _bt_findsplitloc for "equally good points" version
@nsecs:
[4, 8) 1 | |
[8, 16) 8265 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[16, 32) 10391 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[32, 64) 4982 |@@@@@@@@@@@@@@@@@@@@@@@@ |
[64, 128) 1155 |@@@@@ |
[128, 256) 698 |@@@ |
[256, 512) 79 | |
[512, 1K) 2 | |
[1K, 2K) 0 | |
[2K, 4K) 10 | |
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-Add-USDT-for-nbtree-page-splits-and-suffix-trunca.patch | text/plain | 2.5 KB |
| v2-0002-Add-USDT-for-split-location-benchmarking.patch | text/plain | 1.6 KB |
| v2-0003-Randomize-nbtree-split-location-to-avoid-oscillat.patch | text/plain | 4.4 KB |
| v2-0003-Randomize-nbtree-split-location-to-avoid-oscillat.patch.txt | text/plain | 4.5 KB |
|
image/png | 59.0 KB |
|
image/png | 47.8 KB |
| From | Date | Subject | |
|---|---|---|---|
| Previous Message | Bertrand Drouvot | 2026-06-16 10:09:48 | Re: Avoid orphaned objects dependencies, take 3 |