| From: | Michael Paquier <michael(at)paquier(dot)xyz> |
|---|---|
| To: | Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
| Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us |
| Subject: | Re: Extended Statistics set/restore/clear functions. |
| Date: | 2025-11-07 22:56:48 |
| Message-ID: | aQ55MDBjbO8_0fnv@paquier.xyz |
| Lists: | pgsql-hackers |
On Fri, Nov 07, 2025 at 05:28:50PM -0500, Corey Huinker wrote:
> I'm open to other formats, but aside from renaming the json keys (maybe
> "attnums" or "keys" instead of "attributes"?), I'm not sure what really
> could be done and still be JSON. I suppose we could go with a tuple format
> like this:
>
> '{({3,4},11),...}' for pg_ndistinct and
> '{({3},4,1.00000),...}' for pg_dependencies.
>
> Those would certainly be more compact, but makes for a hard read by humans,
> and while the JSON code is big, it's also proven in other parts of the
> codebase, hence less risky.
I've liked the human-readability factor of the format in the current
patches with names in the keys, and values assigned to each property.
Another thing that may be worth doing is pushing the names of the keys
and some of the JSON metadata shaping the object into a new header
that can be loaded by both the backend and the frontend. It would be
nice to not hardcode this knowledge in a bunch of places if we end up
renaming these attributes.
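As a rough illustration of the shared-header idea, here is a minimal
sketch, not what the patch does. Everything in it is hypothetical: the
macro names, the enum, and the lookup helper are invented for this
example, and of the key strings only "attributes" is taken from this
thread; the other three are assumptions.

```c
/*
 * Hypothetical shared header sketch. None of these names come from the
 * patch set; they only illustrate keeping the JSON key names in one
 * place that backend and frontend code could both include.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Key strings: "attributes" appears in the thread, the rest are assumed. */
#define EXTSTAT_KEY_ATTRIBUTES	"attributes"
#define EXTSTAT_KEY_NDISTINCT	"ndistinct"		/* assumed */
#define EXTSTAT_KEY_DEPENDENCY	"dependency"	/* assumed */
#define EXTSTAT_KEY_DEGREE		"degree"		/* assumed */

/* Stable identifiers for the keys, usable in parser state machines. */
typedef enum ExtStatKeyId
{
	EXTSTAT_KEY_ID_ATTRIBUTES = 0,
	EXTSTAT_KEY_ID_NDISTINCT,
	EXTSTAT_KEY_ID_DEPENDENCY,
	EXTSTAT_KEY_ID_DEGREE,
	EXTSTAT_NUM_KEYS
} ExtStatKeyId;

/* Name table indexed by ExtStatKeyId; a rename touches only the macros. */
static const char *const extstat_key_names[] = {
	EXTSTAT_KEY_ATTRIBUTES,
	EXTSTAT_KEY_NDISTINCT,
	EXTSTAT_KEY_DEPENDENCY,
	EXTSTAT_KEY_DEGREE
};

/* Catch a mismatch between the enum and the name table at compile time. */
static_assert(sizeof(extstat_key_names) / sizeof(extstat_key_names[0]) ==
			  EXTSTAT_NUM_KEYS,
			  "extstat_key_names must have one entry per ExtStatKeyId");

/*
 * Map a JSON object field name to its identifier.
 * Returns -1 if the name is not a known key.
 */
static inline int
extstat_lookup_key(const char *name)
{
	for (int i = 0; i < EXTSTAT_NUM_KEYS; i++)
	{
		if (strcmp(name, extstat_key_names[i]) == 0)
			return i;
	}
	return -1;
}
```

With something of this shape, the input function, the output function
and any frontend code would refer to the macros or the enum rather than
to string literals, so a rename from "attributes" to, say, "attnums"
would be a one-line change.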
> A part of me thinks that everything that remains after removing
> in/out/send/recv is just taking a table sample data structure and crunching
> numbers to come up with the deserialized data structure...that's in/out
> with a different starting/ending points.
>
> There's no denying that JSON parsing is a very different code style than
> statistical number crunching, and mixing the two is incongruous, so it's
> worth a shot, and I'll try that for v9.
Yeah, right. Thanks. The parsing pieces seem worth their own file.
> The functions in question are needed because the exprs value is itself an
> array of partly-filled-out pg_attribute tuples, so it's common to those two
> needs, but specific to stats about attributes. Maybe we need an
> attr_stats_utils.h?
Hmm, maybe. I'd be OK to revisit these structures once we're happy
with the in/out structures. That would be a good starting point before
working on the SQL functions and the dump/restore bits in more
detail.
--
Michael