Re: PoC/WIP: Extended statistics on expressions

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PoC/WIP: Extended statistics on expressions
Date: 2021-01-05 14:10:08
Message-ID: CAEZATCWGYooC_Tzt8sQo7760ZEsguj0JGoMLhkANrDwJ9CY20Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 5 Jan 2021 at 00:45, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 1/4/21 4:34 PM, Dean Rasheed wrote:
> >
> > * In src/bin/psql/describe.c, I think the \d output should also
> > exclude the "expressions" stats kind and just list the other kinds (or
> > have no kinds list at all, if there are no other kinds), to make it
> > consistent with the CREATE STATISTICS syntax.
>
> Not sure I understand. Why would this make it consistent with CREATE
> STATISTICS? Can you elaborate?
>

This isn't absolutely essential, but I think it would be neater. For
example, if I have a table with stats like this:

CREATE TABLE foo(a int, b int);
CREATE STATISTICS foo_s_ab (mcv) ON a,b FROM foo;

then the \d output is as follows:

\d foo
Table "public.foo"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
b | integer | | |
Statistics objects:
"public"."foo_s_ab" (mcv) ON a, b FROM foo

and the stats line matches the DDL used to create the stats. It could,
for example, be copy-pasted and tweaked to create similar stats on
another table, but even if that's not very likely, it's neat that it
reflects how the stats were created.

OTOH, if there are expressions in the list, it produces something like this:

Table "public.foo"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
b | integer | | |
Statistics objects:
"public"."foo_s_ab" (mcv, expressions) ON a, b, ((a * b)) FROM foo

which no longer matches the DDL used, and isn't part of an accepted
syntax, so seems a bit inconsistent.

In general, if we're making the "expressions" kind an internal
implementation detail that just gets built automatically when needed,
then I think we should hide it from this sort of output, so the list
of kinds matches the list that the user used when the stats were
created.

Regards,
Dean

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2021-01-05 14:40:56 Re: [PATCH] Runtime control of CLOBBER_CACHE_ALWAYS
Previous Message Bharath Rupireddy 2021-01-05 12:57:57 Re: Parallel Inserts in CREATE TABLE AS