From 4eb6062d5a5d5b0b79363bc986f14d595f9cc826 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthalion6@gmail.com>
Date: Mon, 27 Apr 2026 16:54:36 +0200
Subject: [PATCH v1 2/2] Randomize nbtree split location to avoid oscillating
 patterns
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The way nbtree page splits work can lead to the same split location
being chosen over and over under certain workloads. Simplifying
somewhat, this happens as long as the data being ingested follows the
same distribution as the already existing data, which holds in
particular for an empty tree. According to [1] (and some one-off
experiments) this can make the number of splits follow an oscillating
pattern, which means some intrinsic variability in performance.

The easiest workaround is to introduce a range around the best split
location, and pick the actual split location at random from this
range. Introduce such randomization, based on the split state, which
contains a list of possible split locations. The paper mentioned above
recommends a range of 20%, so we stick with this value. The list of
possible split locations is sorted by delta, meaning that this is not
exactly equivalent to a "range around the best split location", but it
looks close enough.

[1]: Glombiewski N., Seeger B., Graefe G. (2019). Waves of Misery After
Index Creation. BTW 2019. Gesellschaft für Informatik. doi:10.18420/btw2019-06
---
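Note for reviewers: a minimal standalone sketch of the selection described
in the commit message, for illustration only. demo_uint64_range() and
demo_pick_split() are hypothetical names, with rand() standing in for
pg_prng_uint64_range(). The clamp to the end of the candidate array is an
assumption about how to keep the index in bounds, not a statement of what
the patch itself does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for pg_prng_uint64_range(): returns a value in
 * [rmin, rmax], inclusive. Uses rand() for brevity; modulo bias is ignored.
 */
static uint64_t
demo_uint64_range(uint64_t rmin, uint64_t rmax)
{
	if (rmax <= rmin)
		return rmin;
	return rmin + (uint64_t) rand() % (rmax - rmin + 1);
}

/*
 * Pick an index into a delta-sorted candidate array of nsplits entries.
 * The result is at most nsplits / 5 entries (20%) past lowsplit, and never
 * past the last entry.
 */
static int
demo_pick_split(int lowsplit, int nsplits)
{
	int			range;
	int			room;
	int			maxoff;

	assert(lowsplit >= 0 && lowsplit < nsplits);

	range = nsplits / 5;			/* 20%, in integer arithmetic */
	room = nsplits - 1 - lowsplit;	/* entries remaining after lowsplit */
	maxoff = (range < room) ? range : room;

	return lowsplit + (int) demo_uint64_range(0, (uint64_t) maxoff);
}
```

With 40 candidates and lowsplit = 3, the sketch returns an index between 3
and 11. With fewer than 5 candidates the range collapses to zero and
lowsplit itself is returned.
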
 src/backend/access/nbtree/nbtsplitloc.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/nbtree/nbtsplitloc.c b/src/backend/access/nbtree/nbtsplitloc.c
index de9eca3c8b2..71becf0257e 100644
--- a/src/backend/access/nbtree/nbtsplitloc.c
+++ b/src/backend/access/nbtree/nbtsplitloc.c
@@ -17,6 +17,7 @@
 #include "access/nbtree.h"
 #include "access/tableam.h"
 #include "common/int.h"
+#include "common/pg_prng.h"
 
 typedef enum
 {
@@ -792,6 +793,7 @@ _bt_bestsplitloc(FindSplitData *state, int perfectpenalty,
 	int			bestpenalty,
 				lowsplit;
 	int			highsplit = Min(state->interval, state->nsplits);
+	int			rand_offset = 0;
 	SplitPoint *final;
 
 	bestpenalty = INT_MAX;
@@ -812,7 +814,24 @@ _bt_bestsplitloc(FindSplitData *state, int perfectpenalty,
 			break;
 	}
 
-	final = &state->splits[lowsplit];
+	/*
+	 * There are workloads where we would find the same best split location
+	 * over and over, even with suffix truncation introducing some
+	 * variability. According to [1] this leads to the number of splits
+	 * following an oscillating pattern, and the easiest workaround is to
+	 * introduce some randomness in choosing the split location.
+	 *
+	 * To achieve that, add a random shift to lowsplit, drawn from a range
+	 * covering 20% of all possible split locations. Since splits are sorted
+	 * by delta (see _bt_deltasortsplits), this should be close enough to
+	 * introducing a range around the split point.
+	 *
+	 * [1]: Glombiewski N., Seeger B., Graefe G. (2019). Waves of Misery After
+	 * Index Creation. BTW 2019. Gesellschaft für Informatik. doi:10.18420/btw2019-06
+	 */
+	rand_offset = (int) pg_prng_uint64_range(&pg_global_prng_state, 0,
+		Min(state->nsplits / 5, state->nsplits - 1 - lowsplit));
+	final = &state->splits[lowsplit + rand_offset];
 
 	/*
 	 * There is a risk that the "many duplicates" strategy will repeatedly do
-- 
2.52.0