From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PoC/WIP: Extended statistics on expressions |
Date: | 2021-09-01 16:45:29 |
Message-ID: | 6bc0ab0c-a807-2b7f-7df8-ac90c7839013@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 8/24/21 3:13 PM, Justin Pryzby wrote:
> On Mon, Aug 16, 2021 at 05:41:57PM +0200, Tomas Vondra wrote:
>>> This still seems odd:
>>>
>>> postgres=# CREATE STATISTICS asf ON i FROM t;
>>> ERROR: extended statistics require at least 2 columns
>>> postgres=# CREATE STATISTICS asf ON (i) FROM t;
>>> CREATE STATISTICS
>>>
>>> It seems wrong that the command works with added parens, but builds expression
>>> stats on a simple column (which is redundant with what analyze does without
>>> extended stats).
>>
>> Well, yeah. But I think this is a behavior that was discussed somewhere in
>> this thread, and the agreement was that it's not worth the complexity, as
>> this comment explains
>>
>> * XXX We do only the bare minimum to separate simple attribute and
>> * complex expressions - for example "(a)" will be treated as a complex
>> * expression. No matter how elaborate the check is, there'll always be
>> * a way around it, if the user is determined (consider e.g. "(a+0)"),
>> * so it's not worth protecting against it.
>>
>> Patch 0001 fixes the "double parens" issue discussed elsewhere in this
>> thread, and patch 0002 tweaks CREATE STATISTICS to treat "(a)" as a simple
>> column reference.
>
> 0002 refuses to create expressional stats on a simple column reference like
> (a), which I think is helps to avoid a user accidentally creating useless ext
> stats objects (which are redundant with the table's column stats).
>
> 0002 does not attempt to refuse cases like (a+0), which I think is fine:
> we don't try to reject useless cases if someone insists on it.
> See 240971675, 701fd0bbc.
>
> So I am +1 to apply both patches.
>
> I added this as an Opened Item for increased visibility.
>
I've pushed both fixes, so the open item should be resolved.
However while polishing the second patch, I realized we're allowing
statistics on expressions referencing system attributes. So this fails;
CREATE STATISTICS s ON ctid, x FROM t;
but this passes:
CREATE STATISTICS s ON (ctid::text), x FROM t;
IMO we should reject such expressions, just like we reject direct
references to system attributes - patch attached.
Furthermore, I wonder if we should reject expressions without any Vars?
This works now:
CREATE STATISTICS s ON (11:text) FROM t;
but it seems rather silly / useless, so maybe we should reject it.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment | Content-Type | Size |
---|---|---|
0001-Reject-extended-statistics-referencing-system-attrib.patch | text/x-patch | 1.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Jaime Casanova | 2021-09-01 16:51:42 | Re: 2021-09 Commitfest |
Previous Message | Tom Lane | 2021-09-01 16:41:20 | Re: Is it safe to use the extended protocol with COPY? |