Re: multivariate statistics (v19)

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, David Steele <david(at)pgmasters(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Álvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: multivariate statistics (v19)
Date: 2016-10-04 07:49:38
Message-ID: CAEZATCV5ZPqvsbJJ77jr4R9beqd=xwVUnMwBkMeCw5zDdrqRNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 30 September 2016 at 12:10, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> I fear that using "statistics" as the name of the new object might get a bit
> awkward. "statistics" is a plural, but we use it as the name of a single
> object, like "pants" or "scissors". Not sure I have any better ideas though.
> "estimator"? "statistics collection"? Or perhaps it should be singular,
> "statistic". I note that you actually called the system table
> "pg_mv_statistic", in singular.
>

I think it's OK. The functional dependency is a single statistic, but
MCV lists and histograms are multiple statistics (multiple facts about
the data sampled), so in general when you create one of these new
objects, you are creating multiple statistics about the data. Also I
find "CREATE STATISTIC" just sounds a bit clumsy compared to "CREATE
STATISTICS".

The convention for naming system catalogs seems to be to use the
singular for tables and plural for views, so I guess we should stick
with that. It doesn't seem like the end of the world that it doesn't
match the user-facing syntax. A bigger concern is the use of "mv" in
the name, because as has already been pointed out, this table may also
in the future be used to store univariate expression and partial
statistics, so I think we should drop the "mv" and go with something
like pg_statistic_ext, or some other more general name.

Regards,
Dean

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2016-10-04 08:02:54 Re: Declarative partitioning - another take
Previous Message Amit Langote 2016-10-04 07:41:01 Re: Transactions involving multiple postgres foreign servers