Re: pg_stats and range statistics

From: Egor Rogov <e(dot)rogov(at)postgrespro(dot)ru>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_stats and range statistics
Date: 2023-01-21 18:53:20
Message-ID: d30615c2-970e-7397-50e9-f08c540cda18@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Tomas,

On 21.01.2023 00:50, Tomas Vondra wrote:
> Hi Egor,
>
> While reviewing a patch improving join estimates for ranges [1] I
> realized we don't show stats for ranges in pg_stats, and I recalled we
> had this patch.
>
> I rebased the v2, and I decided to took a stab at showing separate
> histograms for lower/upper histogram bounds. I believe it makes it way
> more readable, which is what pg_stats is about IMHO.

Thanks for looking into this.

I have to admit it looks much better this way, so +1.

> This simply adds two functions, accepting/producing anyarray - one for
> lower bounds, one for upper bounds. I don't think it can be done with a
> plain subquery (or at least I don't know how).

Anyarray is an alien to SQL, so functions are well justified here. What
makes me a bit uneasy is two almost identical functions. Should we
consider other options like a function with an additional parameter or a
function returning an array of bounds arrays (which is somewhat
wasteful, but probably it doesn't matter much here)?

> Finally, it renames the empty_range_frac to start with range_, per the
> earlier discussion. I wonder if the new column names for lower/upper
> bounds (range_lower_bounds_histograms/range_upper_bounds_histograms) are
> too long ...

It seems so. The ending -s should be left out since it's a single
histogram now. And I think that
range_lower_histogram/range_upper_histogram are descriptive enough.

I'm adding one more patch to shorten the column names, refresh the docs,
and make 'make check' happy (unfortunately, we have to edit
src/regress/expected/rules.out every time pg_stats definition changes).

>
> regards
>
> [1] https://commitfest.postgresql.org/41/3821/
>

Attachment Content-Type Size
0001-Display-length-and-bounds-histograms-in-pg_-20230120.patch text/plain 6.2 KB
0002-add-functions-to-extract-lower-upper-ramge--20230120.patch text/plain 6.4 KB
0003-rename-empty_range_frac-20230120.patch text/plain 965 bytes
0004-shorten_names-refresh_docs-tests-20230121.patch text/plain 4.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2023-01-21 18:57:15 Re: pgindent vs variable declaration across multiple lines
Previous Message Bruce Momjian 2023-01-21 18:18:25 Re: Unicode grapheme clusters