Re: Extended Statistics set/restore/clear functions.

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Extended Statistics set/restore/clear functions.
Date: 2026-01-22 06:41:36
Message-ID: aXHGoMAnzV5njUEo@paquier.xyz
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 21, 2026 at 03:42:21PM -0500, Corey Huinker wrote:
> I've added about 30 tests covering various things that can go wrong with
> extended stats parameter combinations, parameter contents, expression
> parameter combinations.
>
> There is one WARNING I can see that still needs coverage, but I'm at a loss
> as to how to set it up. It's the extended_stats.c:1202 "could not determine
> element type of expression" which should only happen if we have a datatype
> that appears to be an array but actually isn't...

This result is neater.

How about the multirange range case in extended_statistics_update()
for the mcv/expression path?

import_expressions() also complains about a statatt_set_slot() with
histograms.

> Also improved documentation, moved some code to import_mcv to behave like
> import_expressions, etc.

A couple of mostly-stylistic comments, while reading 0001.

+ errmsg("Expression %s element \"%s\" with is not a valid type \"%s\"",
[...]
+ errmsg("Could not use parameter \"%s\": expected text array of 2 dimensions",
Such error messages don't follow the project policy, this goes
actually backwards compared to v28. errmsg() should not be full
sentences, and no upper-case for the first character. This one should
be replaced with a "could not parse expression %s with element \"%s\":
invalid type \"%s\"". This comment applies to a lot of the new
changes of v29.

+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("Extended statitisics object %s does not expect %s statistics: ",
+ staqualname, "MCV"),
+ errhint("Cannot specify parameters \"%s\", \"%s\", \"%s\", or \"%s\".",
[...]
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("Extended statitisics object %s does not expect %s statistics: ",
+ staqualname, "ndistinct"),
+ errhint("Cannot specify parameter \"%s\".",

These ones are less consistent in style, the error hints should be the
errmsg, and there are some s/statitisics/statistics/. The errhint
should be a full sentence, so I guess that you mean to switch both
fields in such cases.

See the docs for reference about all that:
https://www.postgresql.org/docs/devel/error-style-guide.html

git diff --check is complaining a bit, but that's minor.

+ of type <type>name</type>, for the the table to which the statistics

s/the the/the/, I think this one's mine. :D

While the documentation shows one example with n_distinct,
dependencies and exprs, I'd guess that we should push forward with
something similar regarding most_common_val, most_common_val_nulls,
most_common_freqs, most_common_base_freqs. It looks particularly
important to expand this part, the relationship they have between each
pther, their expected input format (?), and what they map to in the
catalogs. If one is specified, all four of them are required, for
example, but that's not the only thing I imagine should keep a track
of. This data is more complex than the single stats fields.
--
Michael

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2026-01-22 06:51:29 Re: [PATCH v1] Document pg_get_partition_constraintdef.
Previous Message Zsolt Parragi 2026-01-22 06:32:20 Re: Use correct collation in pg_trgm