Re: PoC/WIP: Extended statistics on expressions

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PoC/WIP: Extended statistics on expressions
Date: 2021-03-26 12:54:13
Message-ID: 096d26e1-f053-a039-047d-a050dbb80ac6@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/26/21 12:37 PM, Dean Rasheed wrote:
> On Thu, 25 Mar 2021 at 19:59, Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> Attached is an updated patch series, with all the changes discussed
>> here. I've cleaned up the ndistinct stuff a bit more (essentially
>> reverting back from GroupExprInfo to GroupVarInfo name), and got rid of
>> the UpdateStatisticsForTypeChange.
>>
>
> I've looked over all that and I think it's in pretty good shape. I
> particularly like how much simpler the ndistinct code has now become.
>
> Some (hopefully final) review comments:
>
> 1). I don't think index.c is the right place for
> StatisticsGetRelation(). I appreciate that it is very similar to the
> adjacent IndexGetRelation() function, but index.c is really only for
> index-related code, so I think StatisticsGetRelation() should go in
> statscmds.c
>

Ah, right, I forgot about this. I wonder if we should add
catalog/statistics.c, similar to catalog/index.c (instead of adding it
locally to statscmds.c).

> 2). Perhaps the error message at statscmds.c:293 should read
>
> "expression cannot be used in multivariate statistics because its
> type %s has no default btree operator class"
>
> (i.e., add the word "multivariate", since the same expression *can* be
> used in univariate statistics even though it has no less-than
> operator).
>
> 3). The comment for ATExecAddStatistics() should probably mention that
> "ALTER TABLE ADD STATISTICS" isn't a command in the grammar, in a
> similar way to other similar functions, e.g.:
>
> /*
> * ALTER TABLE ADD STATISTICS
> *
> * This is no such command in the grammar, but we use this internally to add
> * AT_ReAddStatistics subcommands to rebuild extended statistics after a table
> * column type change.
> */
>
> 4). The comment at the start of ATPostAlterTypeParse() needs updating
> to mention CREATE STATISTICS statements.
>
> 5). I think ATPostAlterTypeParse() should also attempt to preserve any
> COMMENTs attached to statistics objects, i.e., something like:
>
> --- src/backend/commands/tablecmds.c.orig 2021-03-26 10:39:38.328631864 +0000
> +++ src/backend/commands/tablecmds.c 2021-03-26 10:47:21.042279580 +0000
> @@ -12619,6 +12619,9 @@
> CreateStatsStmt *stmt = (CreateStatsStmt *) stm;
> AlterTableCmd *newcmd;
>
> + /* keep the statistics object's comment */
> + stmt->stxcomment = GetComment(oldId, StatisticExtRelationId, 0);
> +
> newcmd = makeNode(AlterTableCmd);
> newcmd->subtype = AT_ReAddStatistics;
> newcmd->def = (Node *) stmt;
>
> 6). Comment typo at extended_stats.c:2532 - s/statitics/statistics/
>
> 7). I don't think that the big XXX comment near the start of
> estimate_multivariate_ndistinct() is really relevant anymore, now that
> the code has been simplified and we no longer extract Vars from
> expressions, so perhaps it can just be deleted.
>

Thanks! I'll fix these, and then will consider getting it committed
sometime later today, once the buildfarm does some testing on the other
stuff I already committed.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2021-03-26 12:57:37 Re: WIP: BRIN multi-range indexes
Previous Message Alvaro Herrera 2021-03-26 12:43:27 Re: [PATCH] pg_permissions