| From: | Michael Paquier <michael(at)paquier(dot)xyz> |
|---|---|
| To: | Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
| Cc: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Extended Statistics set/restore/clear functions. |
| Date: | 2026-01-25 01:24:48 |
| Message-ID: | aXVw4LGO4fWunn4n@paquier.xyz |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Fri, Jan 23, 2026 at 01:45:38AM -0500, Corey Huinker wrote:
> I think some of this may have been addressed in this patchset.
>
> I've left these patches un-squashed to help differentiate what changes
> address which problems. They'll likely be re-squashed once those issues are
> addressed.
No problem here for the moment.
> The most interesting here is the change I did to statatt_get_type() to get
> it to work for MCV/expression multiranges. Mostly, the problem arose from
> the fact that MCVlist will have multirange as a part of the exported tuple
> text array, so there is a legit need to parse the input of a multirange's
> text reprepresentation, whereas with attribute stats we only ever need to
> parse the input of the multirange's element, a plain old range. The fix, as
> you may see highlights how statatt_get_type() is doing two things
> (resolving typ* info for an attribute or expression, and fetching the
> basetype and basetype's ltopr and eqopr) and maybe the basetype/operator
> stuff should be moved to its own function.
>
> Another possible solution is to change the two refactor the
> examine_attribute() functions (one in analyze.c, one in extended_stats.c)
> unifying them where possible, and allowing the unified function to still
> generate a VacAttrStats even if the attstattarget is 0. That's a heavier
> lift, and I left things as they are now to demonstrate the issue at hand.
Yes, we are entering into some complexity territory here that needs a
very careful lookup, and I am not sure yet if your suggestion of
input arguments is the best thing to do.
I have been thinking about how to move this patch forward, and I think
that we should split things a bit more the logic (aka my usual
divide-and-conquer strategy) for the restore function, because we
should make sure that each part is sound before merging, and that's
also a risk reduction if we need to revert one part or the other:
1) One patch for ndistinct.
2) One patch for dependencies.
3) One patch for MVC.
4) Dump/restore.
The "exprs" input argument is required by all the three others, meaning
that it needs to stand in the first bits of the patch. The order of
ndistinct and dependencies does not matter, let's just do these based
on the order of the input arguments. Dump/restore could be already
integrated even if we have only 1) and 2) restore integrated in the
tree. It also seems like the "exprs" argument should stand on top of
all the others, as well, so it would make more sense to have it after
"inherited" in extended_stats_argnum?
There are two reasons behind this set of thoughts:
1) The regression tests are bloated. We rely currently on one single
extended stats object that holds all the types of stats (MCV,
dependencies, ndistinct), meaning that for some of the valid cases we
need to cary all the types in the input of restore function. It's
also a bit true in terms of the invalid cases. One thing that I would
recommend to do here is create two simpler extstat objects for each
stxkind (one with expression, one without expression). You should
still require the "global" extstat object that includes all the
stxkind, which counts for the expression import path, of course.
2) Each stxkind has its own set of paths in terms of input and
validation. ndistinct and dependencies are much easier to handle as
they have their own input function, hence they're better handled
first.
--
Michael
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Michael Paquier | 2026-01-25 01:26:33 | Re: Add WALRCV_CONNECTING state to walreceiver |
| Previous Message | Florents Tselai | 2026-01-25 01:15:39 | Re: Add SQL/JSON ON MISMATCH clause to JSON_VALUE |