Re: Extended Statistics set/restore/clear functions.

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Extended Statistics set/restore/clear functions.
Date: 2026-01-23 06:45:38
Message-ID: CADkLM=c3JivzHNXLt-X_JicYknRYwLTiOCHOPiKagm2_vdrFUg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> How about the multirange range case in extended_statistics_update()
> for the mcv/expression path?
>

Added.

> import_expressions() also complains about a statatt_set_slot() with
> histograms.
>

Unsure what you mean here.

> These ones are less consistent in style, the error hints should be the
> errmsg, and there are some s/statitisics/statistics/. The errhint
> should be a full sentence, so I guess that you mean to switch both
> fields in such cases.
>

The 0003 patch of this set addresses this.

> s/the the/the/, I think this one's mine. :D
>

0002

>
> While the documentation shows one example with n_distinct,
> dependencies and exprs, I'd guess that we should push forward with
> something similar regarding most_common_val, most_common_val_nulls,
> most_common_freqs, most_common_base_freqs. It looks particularly
> important to expand this part, the relationship they have between each
> pther, their expected input format (?), and what they map to in the
> catalogs. If one is specified, all four of them are required, for
> example, but that's not the only thing I imagine should keep a track
> of. This data is more complex than the single stats fields.
>

I think some of this may have been addressed in this patchset.

I've left these patches un-squashed to help differentiate what changes
address which problems. They'll likely be re-squashed once those issues are
addressed.

The most interesting here is the change I did to statatt_get_type() to get
it to work for MCV/expression multiranges. Mostly, the problem arose from
the fact that MCVlist will have multirange as a part of the exported tuple
text array, so there is a legit need to parse the input of a multirange's
text reprepresentation, whereas with attribute stats we only ever need to
parse the input of the multirange's element, a plain old range. The fix, as
you may see highlights how statatt_get_type() is doing two things
(resolving typ* info for an attribute or expression, and fetching the
basetype and basetype's ltopr and eqopr) and maybe the basetype/operator
stuff should be moved to its own function.

Another possible solution is to change the two refactor the
examine_attribute() functions (one in analyze.c, one in extended_stats.c)
unifying them where possible, and allowing the unified function to still
generate a VacAttrStats even if the attstattarget is 0. That's a heavier
lift, and I left things as they are now to demonstrate the issue at hand.

Rebased as well.

Attachment Content-Type Size
v30-0001-Add-pg_restore_extended_stats.patch text/x-patch 144.1 KB
v30-0002-the-the-the.patch text/x-patch 1.3 KB
v30-0003-normalize-error-messages.patch text/x-patch 32.4 KB
v30-0004-Avoid-multirange-step-down-in-statatt_get_type-w.patch text/x-patch 4.3 KB
v30-0005-Add-tests-for-multirange-stats.patch text/x-patch 33.4 KB
v30-0006-Include-Extended-Statistics-in-pg_dump.patch text/x-patch 14.0 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Previous Message Antonin Houska 2026-01-23 06:33:32 Re: Race conditions in logical decoding