| From: | Junwang Zhao <zhjwpku(at)gmail(dot)com> |
|---|---|
| To: | Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
| Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Extended Statistics set/restore/clear functions. |
| Date: | 2026-01-24 15:26:54 |
| Message-ID: | CAEG8a3+6ibpVjnGiw6AWcM4roMYoge=aOuw1JP7RV5Wv+q4xvA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Fri, Jan 23, 2026 at 2:46 PM Corey Huinker <corey(dot)huinker(at)gmail(dot)com> wrote:
>>
>> How about the multirange range case in extended_statistics_update()
>> for the mcv/expression path?
>
>
> Added.
>
>>
>> import_expressions() also complains about a statatt_set_slot() with
>> histograms.
>
>
> Unsure what you mean here.
>
>>
>> These ones are less consistent in style, the error hints should be the
>> errmsg, and there are some s/statitisics/statistics/. The errhint
>> should be a full sentence, so I guess that you mean to switch both
>> fields in such cases.
>
>
> The 0003 patch of this set addresses this.
>
>>
>> s/the the/the/, I think this one's mine. :D
>
>
> 0002
>
>>
>>
>> While the documentation shows one example with n_distinct,
>> dependencies and exprs, I'd guess that we should push forward with
>> something similar regarding most_common_val, most_common_val_nulls,
>> most_common_freqs, most_common_base_freqs. It looks particularly
>> important to expand this part, the relationship they have between each
>> pther, their expected input format (?), and what they map to in the
>> catalogs. If one is specified, all four of them are required, for
>> example, but that's not the only thing I imagine should keep a track
>> of. This data is more complex than the single stats fields.
>
>
> I think some of this may have been addressed in this patchset.
>
> I've left these patches un-squashed to help differentiate what changes address which problems. They'll likely be re-squashed once those issues are addressed.
>
> The most interesting here is the change I did to statatt_get_type() to get it to work for MCV/expression multiranges. Mostly, the problem arose from the fact that MCVlist will have multirange as a part of the exported tuple text array, so there is a legit need to parse the input of a multirange's text reprepresentation, whereas with attribute stats we only ever need to parse the input of the multirange's element, a plain old range. The fix, as you may see highlights how statatt_get_type() is doing two things (resolving typ* info for an attribute or expression, and fetching the basetype and basetype's ltopr and eqopr) and maybe the basetype/operator stuff should be moved to its own function.
>
> Another possible solution is to change the two refactor the examine_attribute() functions (one in analyze.c, one in extended_stats.c) unifying them where possible, and allowing the unified function to still generate a VacAttrStats even if the attstattarget is 0. That's a heavier lift, and I left things as they are now to demonstrate the issue at hand.
>
> Rebased as well.
>
v30-0003-normalize-error-messages.patch introduced a typo:
%s/incorect/incorrect
--
Regards
Junwang Zhao
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andres Freund | 2026-01-24 15:31:01 | Re: unnecessary executor overheads around seqscans |
| Previous Message | Álvaro Herrera | 2026-01-24 14:50:28 | unclear OAuth error message |