| From: | Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> |
| Cc: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Extended Statistics set/restore/clear functions. |
| Date: | 2026-01-29 06:20:10 |
| Message-ID: | CADkLM=ebOoi6xjLBmVtd+3Y=d4VxdBbpFY5_T2EobM58Ry=70g@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
>
> - v34 did not include a bunch of the fixes I've done in v33, like
> error messages tweaks, typo fixes, comment adjustments, and some test
> additions. So I had to come back to all that.
>
I'm sorry. I misinterpreted your follow-on inquiry about the need for the
mcv-nulls parameter to mean that you were walking back the need for the v33
patch.
Finally comes the final part of the proposal, as of v34-0002 for the
> restore of expression stats. I have been wondering for a couple of
> days if the proposed interface by relying on
> pg_restore_extended_stats() combined with scans of pg_stats_ext_exprs
> is the best thing we can do, and if a different approach with a
> different function would be better. I don't have a clean answer to
> this question yet, unfortunately.
>
I agree that the method of serializing the stats columns from
pg_stats_ext_exprs into a text[] is a bit opaque.
However, pg_stats_ext_exprs is the only means we have of fetching this
data, as it provides the security barrier that pg_statistic_ext_data does
not. Prior to my awareness of the security requirement, I had assumed that
I could just serialize stxdexpr to text and then unpack it
with deconstruct_array, but even that ran into problems with the stavalues
ANYARRAY at the end, wherein the type needed for the each row of
the stxdexpr array is different based on the object definition...so that
wasn't going anywhere even without the security requirement.
I could envision serializing pg_stats_ext_exprs to a kwargs-ish array
{"null_frac","0.4","avg_width","4","most_common_vals",...} which would at
least remove ambiguity about which string was meant for a given type of
stat unpacking, at a cost of some additional overhead. For that matter we
could collect then into a 1-level json of param_name:textified_value pairs,
but json inputs were not well received before.
If we went to a new function pg_restore_extended_expr_stats() which
specified all the fields from pg_stats_ext_exprs plus an array index value,
we'd still have some of the ANYARRAY issues necessitating casting the mcv,
histogram, and mcelems arrays to a scalar text value, but we'd also have to
address issues of the stxdexpr array elements being restored out of order
or missing a leading element.
All of that is a long way of explaining how the fairly opaque array of
text-casted values won out.
> What I am aware of though is that my schedule clock is unfortunately
> ticking, meaning that I am running out of time to do more here as I
> also need to focus on other things for this release.
Your time is a precious thing, and I thank you for all you've given of it
so far.
> We have achieved
> a lot already in terms of restore support for MCV, ndistinct and
> dependencies, and I am curious to see how this will work out during
> the beta cycle of v19 without the complexities related to expressions,
> so it may not be a bad thing to cool down and see how the situation
> evolves and if what's already been done requires any adjustments
>
If the arc of implementing attribute stats is any guide, we're likely to
get some feedback from the buildfarm in the coming days. I'll be watching
it.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Laurenz Albe | 2026-01-29 06:25:10 | Re: Add SECURITY_INVOKER_VIEWS option to CREATE DATABASE |
| Previous Message | shveta malik | 2026-01-29 06:07:42 | Re: Proposal: Conflict log history table for Logical Replication |