Re: Extended Statistics set/restore/clear functions.

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Tomas Vondra <tomas(at)vondra(dot)me>, pgsql-hackers(at)lists(dot)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us
Subject: Re: Extended Statistics set/restore/clear functions.
Date: 2025-11-13 11:51:00
Message-ID: CACJufxFni1oB-VFuLbVs9VSgVx0czvoZyoW0z_tJ2kq4E3SXYw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

hi.
now looking at v12-0003-Add-working-input-function-for-pg_ndistinct.patch
again.

+ * example input:
+ * [{"attributes": [6, -1], "ndistinct": 14},
+ * {"attributes": [6, -2], "ndistinct": 9143},
+ * {"attributes": [-1,-2], "ndistinct": 13454},
+ * {"attributes": [6, -1, -2], "ndistinct": 14549}]
*/
Datum
pg_ndistinct_in(PG_FUNCTION_ARGS)

extenssted statistics surely won't work on system columns,
how should we deal with case like:
```
{"attributes": [6, -1], "ndistinct": 14}
{"attributes": [6, -7], "ndistinct": 14},
```
issue a warning or error out saying that your attribute number is invalid?
Should we discourage using system columns as examples in comments here?

I have added more test code in src/test/regress/sql/pg_ndistinct.sql,
to improve the code coverage.

as mentioned before in
https://postgr.es/m/CACJufxEZYqocFdgn-x-bJMRBSk_zkS=ziGGkaSumteiPDksnsg@mail.gmail.com
I think it's a good thing to change
``(errcode....``
to
``errcode``.
So I did the change.

+static JsonParseErrorType
+ndistinct_array_element_start(void *state, bool isnull)
+{
+ NDistinctParseState *parse = state;
+
+ switch(parse->state)
+ {
+ case NDIST_EXPECT_ATTNUM:
+ if (!isnull)
+ return JSON_SUCCESS;
+
+ ereturn(parse->escontext, (Datum) 0,
+ (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("malformed pg_ndistinct: \"%s\"", parse->str),
+ errdetail("Attnum list elements cannot be null.")));

this (and many other places) looks wrong, because
ereturn would really return ``(Datum) 0``, and this function returns
JsonParseErrorType.
so we have to errsave here.

+typedef struct
+{
+ const char *str;
+ NDistinctSemanticState state;
+
+ List *distinct_items; /* Accumulated complete MVNDistinctItems */
+ Node *escontext;
+
+ bool found_attributes; /* Item has an attributes key */
+ bool found_ndistinct; /* Item has ndistinct key */
+ List *attnum_list; /* Accumulated attributes attnums */
+ int64 ndistinct;
+} NDistinctParseState;

+ case NDIST_EXPECT_NDISTINCT:
+ /*
+ * While the structure dictates that ndistinct in a double precision
+ * floating point, in practice it has always been an integer, and it
+ * is output as such. Therefore, we follow usage precendent over the
+ * actual storage structure, and read it in as an integer.
+ */
+ parse->ndistinct = pg_strtoint64_safe(token, parse->escontext);
+
+ if (SOFT_ERROR_OCCURRED(parse->escontext))
+ return JSON_SEM_ACTION_FAILED;

NDistinctParseState.ndistinct should be integer,
otherwise pg_ndistinct_out will not be consistent with pg_ndistinct_in?

SELECT '[{"attributes" : [1, 2], "ndistinct" :
2147483648}]'::pg_ndistinct; --error
pg_ndistinct
----------------------------------------------------
[{"attributes": [1, 2], "ndistinct": -2147483648}]
(1 row)

The result seems not what we expected.

Attachment Content-Type Size
v13-0001-refactor-v12-0003.no-cfbot application/octet-stream 17.9 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Yugo Nagata 2025-11-13 12:00:33 Re: Suggestion to add --continue-client-on-abort option to pgbench
Previous Message Thomas Munro 2025-11-13 11:35:48 Re: Trying out <stdatomic.h>