Re: Collecting statistics about contents of JSONB columns

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Mahendra Thalor <mahendra(dot)thalor(at)enterprisedb(dot)com>, Oleg Bartunov <obartunov(at)postgrespro(dot)ru>
Subject: Re: Collecting statistics about contents of JSONB columns
Date: 2022-01-06 19:22:33
Message-ID: c618b784-4295-fb9b-d9ba-e0970ca68ca0@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1/1/22 22:16, Zhihong Yu wrote:
> Hi,
>
> +static JsonPathStats
> +jsonStatsFindPathStats(JsonStats jsdata, char *path, int pathlen)
>
> Stats appears twice in the method name. I think findJsonPathStats()
> should suffice.
> It should check `if (jsdata->nullfrac >= 1.0)`
> as jsonStatsGetPathStatsStr does.
>
> +JsonPathStats
> +jsonStatsGetPathStatsStr(JsonStats jsdata, const char *subpath, int
> subpathlen)
>
> This func can be static, right ?
> I think findJsonPathStatsWithPrefix() would be a better name for the func.
>
> + * XXX Doesn't this need ecape_json too?
> + */
> +static void
> +jsonPathAppendEntryWithLen(StringInfo path, const char *entry, int len)
> +{
> +   char *tmpentry = pnstrdup(entry, len);
> +   jsonPathAppendEntry(path, tmpentry);
>
> ecape_json() is called within jsonPathAppendEntry(). The XXX comment can
> be dropped.
>
> +jsonPathStatsGetArrayIndexSelectivity(JsonPathStats pstats, int index)
>
> It seems getJsonSelectivityWithArrayIndex() would be a better name.
>

Thanks. I'll think about the naming changes.

> +       sel = scalarineqsel(NULL, operator,
> +                           operator == JsonbGtOperator ||
> +                           operator == JsonbGeOperator,
> +                           operator == JsonbLeOperator ||
> +                           operator == JsonbGeOperator,
>
> Looking at the comment for scalarineqsel():
>
>  *  scalarineqsel       - Selectivity of "<", "<=", ">", ">=" for scalars.
>  *
>  * This is the guts of scalarltsel/scalarlesel/scalargtsel/scalargesel.
>  * The isgt and iseq flags distinguish which of the four cases apply.
>
> It seems JsonbLtOperator doesn't appear in the call, can I ask why ?
>

Because the scalarineqsel signature is this

scalarineqsel(PlannerInfo *root, Oid operator, bool isgt, bool iseq,
Oid collation,
VariableStatData *vardata, Datum constval,
Oid consttype)

so

/* is it greater or greater-or-equal */
isgt = operator == JsonbGtOperator ||
operator == JsonbGeOperator

/* is it equality? */
iseq = operator == JsonbLeOperator ||
operator == JsonbGeOperator,

So I think this is correct. A comment explaining this would be nice.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2022-01-06 19:23:57 Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Previous Message Tom Lane 2022-01-06 19:21:15 Re: Index-only scan for btree_gist turns bpchar to char