| From: | Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> |
| Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us |
| Subject: | Re: Extended Statistics set/restore/clear functions. |
| Date: | 2025-11-12 06:47:33 |
| Message-ID: | CADkLM=fv9sszLWJJvbG1meOvHEueN0fubHr2hGQZOYVfv5rwzQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
>
> Thanks for the new patch. And FWIW I disagree with this approach:
> cleanup and refactoring pieces make more sense if done first, as these
> lead to less code churn in the final result. So... I've begun to put
> my hands on the patch set. The whole has been restructured a bit, as
> per the attached. Patch 0001 to 0004 feel OK here, these include two
> code moves and the two output functions:
> - Two new files for adt/, that I'm planning to apply soon as a
> separate cleanup.
> - New output functions, with keys added to a new header named
> statistics_format.h, for frontend and backend consumption.
>
Agreed, 0001-0004 all look good.
> Next comes the input functions. First, I am unhappy with the amount
> of testing that has been put into ndistinct, first and only input
> facility I've looked at in details for the moment. I have quickly
> spotted a couple a few issues while testing buggy input, like this one
> that crashes on pointer dereference, not good obviously:
> SELECT '[]'::pg_ndistinct;
>
- I put some work into more specific error messages for invalid values for
both pg_ndistinct and pg_dependencies.
- The check for empty attribute lists and item lists now occur in the
array-end event handler.
- Also tried to standardize conventions between the two data types (switch
statements, similar utility functions, etc).
>
> These are checked in the patches that introduce the functions like
> with pg_ndistinct_validate_items(), based on the list of stxkeys we
> have. However, I think that this is not enough by itself. Shouldn't
> we check that the list of items in the array is what we expect based
> on the longest "attributes" array at least, even after a JSON that was
> parsed? That would be cheap to check in the output function itself,
> at least as a first layer of checks before trying something with the
> import function and cross-checking the list of attributes for the
> extended statistics object.
I added tests for both duplicate attribute sequences as well as making the
first-longest attribute sequence the template by which all later and
shorter sequences are checked. I had been reluctant to add checks like
this, because so many similar validations were removed from the earlier
statistics code like histograms and the like.
>
> I suspect a similar family of issues with pg_dependencies, and it
> would be nice to move the tests with the input function into a new
> regression file, like the other one.
>
Did so.
0001-0004,0007,0009 unchanged.
Significant modification of the stats_import.sql regression tests in 0008
to conform to stricter datatype rules enacted in 0005, 0006.
| Attachment | Content-Type | Size |
|---|---|---|
| v11-0001-Make-pg_ndinstinct-a-proper-adt.patch | text/x-patch | 6.6 KB |
| v11-0002-Make-pg_dependencies-a-proper-adt.patch | text/x-patch | 7.7 KB |
| v11-0003-Refactor-output-format-of-pg_ndistinct.patch | text/x-patch | 17.0 KB |
| v11-0004-Refactor-output-format-of-pg_dependencies.patch | text/x-patch | 11.1 KB |
| v11-0005-Add-working-input-function-for-pg_ndistinct.patch | text/x-patch | 25.8 KB |
| v11-0006-Add-working-input-function-for-pg_dependencies.patch | text/x-patch | 30.4 KB |
| v11-0007-Expose-attribute-statistics-functions-for-use-in.patch | text/x-patch | 10.7 KB |
| v11-0008-Add-extended-statistics-support-functions.patch | text/x-patch | 166.2 KB |
| v11-0009-Include-Extended-Statistics-in-pg_dump.patch | text/x-patch | 13.6 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | shveta malik | 2025-11-12 06:50:55 | Re: Proposal: Conflict log history table for Logical Replication |
| Previous Message | Tom Lane | 2025-11-12 06:38:47 | Re: BUG #19095: Test if function exit() is used fail when linked static |