RE: list of extended statistics on psql

From: "Shinoda, Noriyoshi (PN Japan FSIP)" <noriyoshi(dot)shinoda(at)hpe(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Tatsuro Yamada <tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: list of extended statistics on psql
Date: 2021-01-17 01:41:04
Message-ID: TU4PR8401MB115239DC3FE567BE637F7544EEA50@TU4PR8401MB1152.NAMPRD84.PROD.OUTLOOK.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, hackers.

I tested this committed feature.
It doesn't seem to be available to non-superusers due to the inability to access pg_statistics_ext_data.
Is this the expected behavior?

--- operation ---
postgres=> CREATE STATISTICS stat1_data1 ON c1, c2 FROM data1;
CREATE STATISTICS
postgres=> ANALYZE data1;
ANALYZE
postgres=> SELECT * FROM pg_statistic_ext;
oid | stxrelid | stxname | stxnamespace | stxowner | stxstattarget | stxkeys | stxkind
-------+----------+-------------+--------------+----------+---------------+---------+---------
16393 | 16385 | stat1_data1 | 2200 | 16384 | -1 | 1 2 | {d,f,m}
(1 row)

postgres=> \dX
ERROR: permission denied for table pg_statistic_ext_data
postgres=>
postgres=> \connect postgres postgres
You are now connected to database "postgres" as user "postgres".
postgres=#
postgres=# \dX
List of extended statistics
Schema | Name | Definition | Ndistinct | Dependencies | MCV
--------+-------------+-------------------+-----------+--------------+-----------
public | stat1_data1 | c1, c2 FROM data1 | built | built | requested
(1 row)

--- operation ---

Regards,
Noriyoshi Shinoda

-----Original Message-----
From: Tomas Vondra [mailto:tomas(dot)vondra(at)enterprisedb(dot)com]
Sent: Sunday, January 17, 2021 8:32 AM
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>; Tatsuro Yamada <tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>; Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>; Michael Paquier <michael(at)paquier(dot)xyz>; Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>; PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: list of extended statistics on psql

On 1/15/21 5:19 PM, Tomas Vondra wrote:
>
>
> On 1/15/21 9:47 AM, Julien Rouhaud wrote:
>> On Wed, Jan 13, 2021 at 10:22:05AM +0900, Tatsuro Yamada wrote:
>>> Hi Tomas,
>>>
>>> On 2021/01/13 7:48, Tatsuro Yamada wrote:
>>>> On 2021/01/12 20:08, Tomas Vondra wrote:
>>>>> On 1/12/21 2:57 AM, Tatsuro Yamada wrote:
>>>>>> On 2021/01/09 9:01, Tomas Vondra wrote:
>>>>> ...>
>>>>>>> While working on that, I realized that 'defined' might be a bit
>>>>>>> ambiguous, I initially thought it means 'NOT NULL' (which it does not).
>>>>>>> I propose to change it to 'requested' instead. Tatsuro, do you
>>>>>>> agree, or do you think 'defined' is better?
>>>>>>
>>>>>> Regarding the status of extended stats, I think the followings:
>>>>>>
>>>>>>   - "defined": it shows the extended stats defined only. We
>>>>>> can't know
>>>>>>                whether it needs to analyze or not. I agree this
>>>>>> name was
>>>>>>                 ambiguous. Therefore we should replace it with a
>>>>>> more suitable
>>>>>>                name.
>>>>>>   - "requested": it shows the extended stats needs something. Of
>>>>>> course,
>>>>>>                we know it needs to ANALYZE because we can create the patch.
>>>>>>                However, I feel there is a little ambiguity for DBA.
>>>>>>                To solve this, it would be better to write an
>>>>>> explanation of
>>>>>>                the status in the document. For example,
>>>>>>
>>>>>> ======
>>>>>> The column of the kind of extended stats (e. g. Ndistinct) shows some statuses.
>>>>>> "requested" means that it needs to gather data by ANALYZE.
>>>>>> "built" means ANALYZE
>>>>>>   was finished, and the planner can use it. NULL means that it doesn't exists.
>>>>>> ======
>>>>>>
>>>>>> What do you think? :-D
>>>>>>
>>>>>
>>>>> Yes, that seems reasonable to me. Will you provide an updated patch?
>>>>
>>>>
>>>> Sounds good. I'll send the updated patch today.
>>>
>>>
>>>
>>> I updated the patch to add the explanation of the extended stats' statuses.
>>> Please feel free to modify the patch to improve it more clearly.
>>>
>>> The attached files are:
>>> 0001: Add psql \dx and the fixed document
>>> 0002: Regression test for psql \dX
>>> app-psql.html: Created by "make html" command (You can check the
>>> explanation of the statuses easily, probably)
>>
>> Hello Yamada-san,
>>
>> I reviewed the patch and don't have specific complaints, it all looks good!
>>
>> I'm however thinking about the "requested" status. I'm wondering if
>> it could lead to people think that an ANALYZE is scheduled and will happen soon.
>> Maybe "defined" or "declared" might be less misleading, or even
>> "waiting for analyze"?
>>
>
> Well, the "defined" option is not great either, because it can be
> interpreted as "NOT NULL" - that's why I proposed "requested". Not
> sure about "declared" - I wouldn't use it in this context, but I'm not
> a native speaker so maybe it's OK.
>

I've pushed this, keeping the "requested". If we decide that some other term is a better choice, we can tweak that later of course.

Thanks Tatsuro-san for the patience!

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2021-01-17 01:43:10 Re: Why does create_gather_merge_plan need make_sort?
Previous Message Tomas Vondra 2021-01-17 00:23:39 Re: PoC/WIP: Extended statistics on expressions