Re: PoC/WIP: Extended statistics on expressions

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PoC/WIP: Extended statistics on expressions
Date: 2021-09-19 23:07:23
Message-ID: 57ad73f5-a979-5b6c-10e4-6f4ce15fce6b@enterprisedb.com
Lists: pgsql-hackers

On 9/3/21 5:56 AM, Justin Pryzby wrote:
> On Wed, Sep 01, 2021 at 06:45:29PM +0200, Tomas Vondra wrote:
>> However while polishing the second patch, I realized we're allowing
>> statistics on expressions referencing system attributes. So this fails:
>>
>> CREATE STATISTICS s ON ctid, x FROM t;
>>
>> but this passes:
>>
>> CREATE STATISTICS s ON (ctid::text), x FROM t;
>>
>> IMO we should reject such expressions, just like we reject direct references
>> to system attributes - patch attached.
>
> Right, same as indexes. +1
>

I've pushed this check, disallowing extended stats on expressions
referencing system attributes. This means we'll reject both ctid and
(ctid::text), just like for indexes.
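
A minimal sketch of the behavior after this change, assuming a hypothetical
table t with integer columns x and y (the statistics names s1 and s2 are
likewise only illustrative). All four statements are expected to fail, since
each one references the system attribute ctid either directly or inside an
expression:

```sql
-- Assumes a table like this already exists (hypothetical definition):
--   CREATE TABLE t (x int, y int);

-- Direct reference to a system attribute: rejected before and after the change.
CREATE STATISTICS s1 ON ctid, x FROM t;

-- Expression referencing a system attribute: previously accepted, now rejected.
CREATE STATISTICS s2 ON (ctid::text), x FROM t;

-- The index counterparts, which are rejected in the same way.
CREATE INDEX ON t (ctid);
CREATE INDEX ON t ((ctid::text));
```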

>> Furthermore, I wonder if we should reject expressions without any Vars? This
>> works now:
>>
>> CREATE STATISTICS s ON (11::text) FROM t;
>>
>> but it seems rather silly / useless, so maybe we should reject it.
>
> To my surprise, this is also allowed for indexes...
>
> But (maybe this is what I was remembering) it's prohibited to have a constant
> expression as a partition key.
>
> Expressions without a var seem like a case where the user did something
> deliberately silly, and dissimilar from the case of making a stats expression
> on a simple column - that seemed like it could be a legitimate
> mistake/confusion (it's not unreasonable to write an extra parenthesis, but
> it's strange if that causes it to behave differently).
>
> I think it's not worth too much effort to prohibit this: if they're determined,
> they can still write an expression with a var which is constant. I'm not going
> to say it's worth zero effort, though.
>

I've decided not to push this. The statistics objects on expressions not
referencing any variables seem useless, but maybe not entirely - we
allow volatile expressions, like

CREATE STATISTICS s ON (random()) FROM t;

which I suppose might be useful. And we allow similar cases (except for
the volatility, of course) for indexes too.
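
To make the comparison concrete, here is a rough sketch of the Var-free cases
discussed above, again assuming a hypothetical table t with an integer column
x (the object names are illustrative only). The comments reflect what is
described in this thread, not freshly verified output:

```sql
-- Assumes a table like this already exists (hypothetical definition):
--   CREATE TABLE t (x int);

-- Constant expression without any Vars: accepted for extended statistics.
CREATE STATISTICS s_const ON (11::text) FROM t;

-- Volatile expression without any Vars: also accepted for extended statistics.
CREATE STATISTICS s_volatile ON (random()) FROM t;

-- Constant expression in an index: accepted, as noted earlier in the thread.
CREATE INDEX t_const_idx ON t ((11::text));

-- Volatile expression in an index: rejected, because functions used in
-- index expressions must be marked IMMUTABLE.
CREATE INDEX t_volatile_idx ON t ((random()));
```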

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
