Re: Collecting statistics about contents of JSONB columns

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Mahendra Thalor <mahendra(dot)thalor(at)enterprisedb(dot)com>, Oleg Bartunov <obartunov(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Collecting statistics about contents of JSONB columns
Date: 2022-04-08 00:31:22
Message-ID: 20220408003122.GF24419@telsasoft.com
Lists: pgsql-hackers

I noticed some typos.

diff --git a/src/backend/utils/adt/jsonb_selfuncs.c b/src/backend/utils/adt/jsonb_selfuncs.c
index f5520f88a1d..d98cd7020a1 100644
--- a/src/backend/utils/adt/jsonb_selfuncs.c
+++ b/src/backend/utils/adt/jsonb_selfuncs.c
@@ -1342,7 +1342,7 @@ jsonSelectivityContains(JsonStats stats, Jsonb *jb)
path->stats = jsonStatsFindPath(stats, pathstr.data,
pathstr.len);

- /* Appeend path string entry for array elements, get stats. */
+ /* Append path string entry for array elements, get stats. */
jsonPathAppendEntry(&pathstr, NULL);
pstats = jsonStatsFindPath(stats, pathstr.data, pathstr.len);
freq = jsonPathStatsGetFreq(pstats, 0.0);
@@ -1367,7 +1367,7 @@ jsonSelectivityContains(JsonStats stats, Jsonb *jb)
case WJB_END_ARRAY:
{
struct Path *p = path;
- /* Absoulte selectivity of the path with its all subpaths */
+ /* Absolute selectivity of the path with its all subpaths */
Selectivity abs_sel = p->sel * p->freq;

/* Pop last path entry */
diff --git a/src/backend/utils/adt/jsonb_typanalyze.c b/src/backend/utils/adt/jsonb_typanalyze.c
index 7882db23a87..9a759aadafb 100644
--- a/src/backend/utils/adt/jsonb_typanalyze.c
+++ b/src/backend/utils/adt/jsonb_typanalyze.c
@@ -123,10 +123,9 @@ typedef struct JsonScalarStats
/*
* Statistics calculated for a set of values.
*
- *
* XXX This seems rather complicated and needs simplification. We're not
* really using all the various JsonScalarStats bits, there's a lot of
- * duplication (e.g. each JsonScalarStats contains it's own array, which
+ * duplication (e.g. each JsonScalarStats contains its own array, which
* has a copy of data from the one in "jsons").
*/
typedef struct JsonValueStats
@@ -849,7 +848,7 @@ jsonAnalyzePathValues(JsonAnalyzeContext *ctx, JsonScalarStats *sstats,
stats->stanullfrac = (float4)(1.0 - freq);

/*
- * Similarly, we need to correct the MCV frequencies, becuse those are
+ * Similarly, we need to correct the MCV frequencies, because those are
* also calculated only from the non-null values. All we need to do is
* simply multiply that with the non-NULL frequency.
*/
@@ -1015,7 +1014,7 @@ jsonAnalyzeBuildPathStats(JsonPathAnlStats *pstats)

/*
* We keep array length stats here for queries like jsonpath '$.size() > 5'.
- * Object lengths stats can be useful for other query lanuages.
+ * Object lengths stats can be useful for other query languages.
*/
if (vstats->arrlens.values.count)
jsonAnalyzeMakeScalarStats(&ps, "array_length", &vstats->arrlens.stats);
@@ -1069,7 +1068,7 @@ jsonAnalyzeCalcPathFreq(JsonAnalyzeContext *ctx, JsonPathAnlStats *pstats,
* We're done with accumulating values for this path, so calculate the
* statistics for the various arrays.
*
- * XXX I wonder if we could introduce some simple heuristict on which
+ * XXX I wonder if we could introduce some simple heuristic on which
* paths to keep, similarly to what we do for MCV lists. For example a
* path that occurred just once is not very interesting, so we could
* decide to ignore it and not build the stats. Although that won't
@@ -1414,7 +1413,7 @@ compute_json_stats(VacAttrStats *stats, AnalyzeAttrFetchFunc fetchfunc,

/*
* Collect and analyze JSON path values in single or multiple passes.
- * Sigle-pass collection is faster but consumes much more memory than
+ * Single-pass collection is faster but consumes much more memory than
* collecting and analyzing by the one path at pass.
*/
if (ctx.single_pass)
