Re: PoC/WIP: Extended statistics on expressions

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PoC/WIP: Extended statistics on expressions
Date: 2021-03-24 17:18:16
Message-ID: 20210324171816.GC15100@telsasoft.com
Lists: pgsql-hackers

On Wed, Mar 24, 2021 at 05:15:46PM +0000, Dean Rasheed wrote:
> On Wed, 24 Mar 2021 at 16:48, Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >
> > As for the changes proposed in the create_statistics, do we really want
> > to use univariate / multivariate there? Yes, the terms are correct, but
> > I'm not sure how many people looking at CREATE STATISTICS will
> > understand them.
> >
>
> Hmm, I think "univariate" and "multivariate" are pretty ubiquitous,
> when used to describe statistics. You could use "single-column" and
> "multi-column", but then "column" isn't really right anymore, since it
> might be a column or an expression. I can't think of any other terms
> that fit.

We already use "multivariate" in the docs, just not in the user-visible text of
create_statistics.sgml:

doc/src/sgml/perform.sgml: <firstterm>multivariate statistics</firstterm>, which can capture
doc/src/sgml/perform.sgml: it's impractical to compute multivariate statistics automatically.
doc/src/sgml/planstats.sgml: <sect1 id="multivariate-statistics-examples">
doc/src/sgml/planstats.sgml: <secondary>multivariate</secondary>
doc/src/sgml/planstats.sgml: multivariate statistics on the two columns:
doc/src/sgml/planstats.sgml: <sect2 id="multivariate-ndistinct-counts">
doc/src/sgml/planstats.sgml: But without multivariate statistics, the estimate for the number of
doc/src/sgml/planstats.sgml: This section introduces multivariate variant of <acronym>MCV</acronym>
doc/src/sgml/ref/create_statistics.sgml: and <xref linkend="multivariate-statistics-examples"/>.
doc/src/sgml/release-13.sgml:2020-01-13 [eae056c19] Apply multiple multivariate MCV lists when possible

So I think the answer is for create_statistics.sgml to expose that word in a
user-facing way where it references multivariate-statistics-examples.

--
Justin
