From fb3440742d4c30dcda526eea759d4294975daa27 Mon Sep 17 00:00:00 2001
From: Michail Nikolaev
Date: Fri, 16 Sep 2022 18:38:02 +0300
Subject: [PATCH v8] Currently, KnownAssignedXidsGetAndSetXmin requires an
 iterative loop through the KnownAssignedXids array, including xids marked
 as invalid. The performance impact is especially noticeable in the presence
 of long (a few seconds) transactions on the primary, a high value (a few
 thousand) of max_connections, and a high read workload on the standby. Most
 of the CPU is spent looping through KnownAssignedXids even though almost
 all xids are invalid anyway. KnownAssignedXidsCompress removes invalid xids
 from time to time, but performance is still affected.

To increase performance, the frequency of running KnownAssignedXidsCompress
is increased: it is now called for each xid % 64 == 0 (the number was
selected by running benchmarks). Also, the minimum bound on the number of
elements to compress (4 * PROCARRAY_MAXPROCS) is removed.

Simon Riggs, with some editorialization by Michail Nikolaev.
---
 src/backend/storage/ipc/procarray.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 0555b02a8d..d1f20c3f50 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -4617,18 +4617,22 @@ KnownAssignedXidsCompress(bool force)
 	{
 		/*
 		 * If we can choose how much to compress, use a heuristic to avoid
-		 * compressing too often or not often enough.
+		 * compressing too often or not often enough. "Compress" here means
+		 * simply moving the values to the beginning of the array, so is
+		 * not as complex or costly as typical data compression algorithms.
 		 *
-		 * Heuristic is if we have a large enough current spread and less than
-		 * 50% of the elements are currently in use, then compress. This
-		 * should ensure we compress fairly infrequently. We could compress
-		 * less often though the virtual array would spread out more and
-		 * snapshots would become more expensive.
+		 * Heuristic is if less than 50% of the elements are currently in
+		 * use, then compress. This ensures time to take a snapshot is
+		 * bounded at S=2N, using the same notation from earlier comments,
+		 * which is essential to avoid limiting scalability with high N.
+		 *
+		 * As noted earlier, compression is O(S), so now O(2N), while frequency
+		 * of compression is now O(1/N) so that as N varies, the algorithm
+		 * balances nicely the frequency and cost of compression.
 		 */
 		int			nelements = head - tail;
 
-		if (nelements < 4 * PROCARRAY_MAXPROCS ||
-			nelements < 2 * pArray->numKnownAssignedXids)
+		if (nelements < 2 * pArray->numKnownAssignedXids)
 			return;
 	}
 
@@ -4924,8 +4928,13 @@ KnownAssignedXidsRemoveTree(TransactionId xid, int nsubxids,
 	for (i = 0; i < nsubxids; i++)
 		KnownAssignedXidsRemove(subxids[i]);
 
-	/* Opportunistically compress the array */
-	KnownAssignedXidsCompress(false);
+	/*
+	 * Opportunistically compress the array, every N commits.
+	 * 64 is selected by running benchmarks.
+	 */
+	if (TransactionIdIsValid(xid) &&
+		((int) xid) % 64 == 0)
+		KnownAssignedXidsCompress(false);
 }
 
 /*
-- 
2.25.1