| From: | Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
|---|---|
| To: | jian he <jian(dot)universality(at)gmail(dot)com> |
| Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Tomas Vondra <tomas(at)vondra(dot)me>, pgsql-hackers(at)lists(dot)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us |
| Subject: | Re: Extended Statistics set/restore/clear functions. |
| Date: | 2025-11-14 05:49:23 |
| Message-ID: | CADkLM=d_mmTn3+ChL-z_ZomwOKT-BssQU+=6_a8WDv6A4b7uAA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
>
> extenssted statistics surely won't work on system columns,
> how should we deal with case like:
> ```
> {"attributes": [6, -1], "ndistinct": 14}
> {"attributes": [6, -7], "ndistinct": 14},
> ```
> issue a warning or error out saying that your attribute number is invalid?
> Should we discourage using system columns as examples in comments here?
>
Negative numbers represent the Nth expression defined in the extended
statistics object. So if you have extended statistics on a, b, length(a),
length(b) then you can legally have -1 and -2 in the attributes, but
nothing lower than that.
See functions pg_ndistinct_validate_items() and
pg_depdendencies_validate_deps() as these check the attributes in the value
against the definition of the extended stats object.
Though this does bring up a small naming problem: Elements of a
pg_dependencies are Dependency, abbreviated to dep, but are also called
Items like the elements in a pg_ndistinct. We should pick a standard name
for such things (probably item) and use it everywhere.
>
> I have added more test code in src/test/regress/sql/pg_ndistinct.sql,
> to improve the code coverage.
>
I'm trying to implement those test cases, but I may have missed some.
>
>
> this (and many other places) looks wrong, because
> ereturn would really return ``(Datum) 0``, and this function returns
> JsonParseErrorType.
> so we have to errsave here.
>
Good point. Implemented.
>
> NDistinctParseState.ndistinct should be integer,
> otherwise pg_ndistinct_out will not be consistent with pg_ndistinct_in?
>
+1
Implemented many, but not all of these suggestions.
| Attachment | Content-Type | Size |
|---|---|---|
| v13-0001-Refactor-output-format-of-pg_ndistinct.patch | text/x-patch | 17.2 KB |
| v13-0002-Refactor-output-format-of-pg_dependencies.patch | text/x-patch | 12.1 KB |
| v13-0003-Add-working-input-function-for-pg_ndistinct.patch | text/x-patch | 29.3 KB |
| v13-0004-Add-working-input-function-for-pg_dependencies.patch | text/x-patch | 34.3 KB |
| v13-0005-Expose-attribute-statistics-functions-for-use-in.patch | text/x-patch | 10.7 KB |
| v13-0006-Add-extended-statistics-support-functions.patch | text/x-patch | 166.2 KB |
| v13-0007-Include-Extended-Statistics-in-pg_dump.patch | text/x-patch | 13.6 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Michael Paquier | 2025-11-14 06:04:25 | Re: pgsql: Drop unnamed portal immediately after execution to completion |
| Previous Message | Japin Li | 2025-11-14 05:49:07 | Re: [PATCH] Add pg_get_database_ddl() function to reconstruct CREATE DATABASE statement |