Re: PoC/WIP: Extended statistics on expressions

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PoC/WIP: Extended statistics on expressions
Date: 2021-01-08 02:35:37
Message-ID: 20210108023537.GA19743@telsasoft.com
Lists: pgsql-hackers

On Fri, Jan 08, 2021 at 01:57:29AM +0100, Tomas Vondra wrote:
> Attached is a patch fixing most of the issues. There are a couple
> exceptions:

In the docs:

+ &mdash; at the cost that its schema must be extended whenever the structure
+ of statistics <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link> changes.

should say "of statistics *IN* pg_statistic changes" ?

+ to an expression index. The full variant allows defining statistics objects
+ on multiple columns and expressions, and pick which statistics kinds will
+ be built. The per-expression statistics are built automatically when there

"and pick" is wrong - maybe say "and selecting which.."

+ and run a query using an expression on that column. Without the

remove "the" ?

+ extended statistics, the planner has no information about data
+ distribution for reasults of those expression, and uses default

*results

+ estimates as illustrated by the first query. The planner also does
+ not realize the value of the second column fully defines the value
+ of the other column, because date truncated to day still identifies
+ the month). Then expression and ndistinct statistics are built on

The ")" is unbalanced

+ /* all parts of thi expression are covered by this statistics */

this

+ * GrouExprInfos, but only if it's not known equal to any of the existing

Group

+ * we don't allow specifying any statistis kinds. The simple variant

statistics

+ * If no statistic type was specified, build them all (but request

Say "kind" not "type" ?

+ * expression is a simple Var. OTOH we check that there's at least one
+ * statistics matching the expression.

one statistic (singular) ?

+ * the future, we might consider
+ */

consider ???

+-- (not it fails, when there are no simple column references)

note?

There's some remaining copy/paste stuff from index expressions:

errmsg("statistics expressions and predicates can refer only to the table being indexed")));
left behind by evaluating the predicate or index expressions.
Set up for predicate or expression evaluation
Need an EState for evaluation of index expressions and
/* Compute and save index expression values */
left behind by evaluating the predicate or index expressions.
Fetch function for analyzing index expressions.
partial-index predicates. Create it in the per-index context to be
* When analyzing an expression index, believe the expression tree's type

--
Justin
