Re: Extended Statistics set/restore/clear functions.

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, pgsql-hackers(at)lists(dot)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us
Subject: Re: Extended Statistics set/restore/clear functions.
Date: 2025-11-18 05:07:23
Message-ID: CACJufxGGzkb58BU+YyTa9cBAawhybwk2cPFZ1XupS-8xuAzN9A@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 17, 2025 at 2:56 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Fri, Nov 14, 2025 at 03:25:27PM +0900, Michael Paquier wrote:
> > Thanks for the new versions, I'll also look at all these across the
> > next couple of days. Probably not at 0005~ for now.
>
> 0001 and 0002 from series v13 have been applied to change the output
> functions.
>

> And I have looked at 0003 in details for now. Attached is a revised
> version for it, with many adjustments. Some notes:
> - Many portions of the coverage were missing. I have measured the
> coverage at 91% with the updated version attached. This includes
> coverage for some error reporting, something that we rely a lot on for
> this code.
> - The error reports are made simpler, with the token values getting
> hidden. While testing with some fancy values, I have actually noticed
> that the error handlings for the parsing of the int16 and int32 values
> were incorrect, the error reports used what the safe functions
> generated, not the reports from the data type.
> - Passing down arbitrary bytes sequences was leading to these bytes
> reported in the error outputs because we cared about the token values.
> I have added a few tests based on that for the code paths involved.
>
Hi.

In src/backend/statistics/mvdistinct.c, we have:
Assert(AttributeNumberIsValid(item->attributes[j]));

Should we disallow 0 in key attributes? For example:
SELECT '[{"attributes" : [0,1], "ndistinct" : 4}]'::pg_ndistinct;
I haven't found a way to trigger this Assert yet.
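
To make the question concrete, here is a minimal standalone sketch of the kind of check the input function could run before accepting the attribute list. The typedef and macros mirror the definitions in access/attnum.h so that the snippet compiles on its own; the helper ndistinct_attnums_are_valid() is hypothetical and is not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the definitions in access/attnum.h. */
typedef int16_t AttrNumber;
#define InvalidAttrNumber 0
#define AttributeNumberIsValid(attributeNumber) \
	((bool) ((attributeNumber) != InvalidAttrNumber))

/*
 * Hypothetical helper: return true only if every attribute number in the
 * parsed key list is valid, i.e. nonzero.  Negative numbers are left alone
 * here, only 0 is rejected.
 */
static bool
ndistinct_attnums_are_valid(const AttrNumber *attnums, int natts)
{
	for (int i = 0; i < natts; i++)
	{
		if (!AttributeNumberIsValid(attnums[i]))
			return false;
	}
	return true;
}
```

In the real parser, a false result would presumably be reported through errsave() in the same way as the other malformed-input cases.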

+ errsave(parse->escontext,
+ errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("malformed pg_ndistinct: \"%s\"", parse->str),
+ errdetail("Invalid \"%s\" value.", PG_NDISTINCT_KEY_ATTRIBUTES));

+ errsave(parse->escontext,
+ errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("malformed pg_ndistinct: \"%s\"", parse->str),
+ errdetail("Invalid \"%s\" value.",
+ PG_NDISTINCT_KEY_NDISTINCT));

Isn't the errdetail way too generic?
Similar to the error for ``select 'a'::int;``, we could report something like:
DETAIL: Invalid input syntax for type integer: "a"
HINT: "ndistinct" value expected to be a type of integer.

What do you think?

We already have "fname" in ndistinct_object_field_start,
so we can also print out the "fname", like:
errsave(parse->escontext,
errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
errmsg("malformed pg_ndistinct: \"%s\"", parse->str),
errdetail("Unexpected key \"%s\"", fname),
errhint("Only allowed keys are \"%s\" and \"%s\".",
PG_NDISTINCT_KEY_ATTRIBUTES,
PG_NDISTINCT_KEY_NDISTINCT));

SELECT '[{"attributes" : [2,3], "ndistinct" : 4, "ndistinct" : 14}]'::pg_ndistinct;
pg_ndistinct
-------------------------------------------
[{"attributes": [2, 3], "ndistinct": 14}]

SELECT '[{"attributes" : [2,3], "ndistinct" : 4, "attributes" : []}]'::pg_ndistinct;
pg_ndistinct
------------------------------------------
[{"attributes": [2, 3], "ndistinct": 4}]

Both inputs repeat a key, and both are accepted without error.
Is the above output what we expect?
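
If it is not, one option would be to remember which keys have already been seen for the current object and reject a repeat. Below is a minimal standalone sketch of that idea. It is not taken from the patch: the struct, the enum, and ndistinct_note_key() are hypothetical names, and the two key strings are assumed to match PG_NDISTINCT_KEY_ATTRIBUTES and PG_NDISTINCT_KEY_NDISTINCT based on the JSON shown above.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed key names, taken from the JSON examples above. */
#define KEY_ATTRIBUTES "attributes"
#define KEY_NDISTINCT  "ndistinct"

/* Hypothetical per-object state: which keys were already seen. */
typedef struct NDistinctKeysSeen
{
	bool		attributes;
	bool		ndistinct;
} NDistinctKeysSeen;

typedef enum NDistinctKeyCheck
{
	NDISTINCT_KEY_OK,
	NDISTINCT_KEY_DUPLICATE,
	NDISTINCT_KEY_UNKNOWN
} NDistinctKeyCheck;

/*
 * Hypothetical helper, meant to be called when an object field starts.
 * Records the key in "seen" and reports whether it is new, repeated,
 * or not one of the two allowed keys.
 */
static NDistinctKeyCheck
ndistinct_note_key(NDistinctKeysSeen *seen, const char *fname)
{
	bool	   *flag;

	if (strcmp(fname, KEY_ATTRIBUTES) == 0)
		flag = &seen->attributes;
	else if (strcmp(fname, KEY_NDISTINCT) == 0)
		flag = &seen->ndistinct;
	else
		return NDISTINCT_KEY_UNKNOWN;

	if (*flag)
		return NDISTINCT_KEY_DUPLICATE;

	*flag = true;
	return NDISTINCT_KEY_OK;
}
```

The state would need to be reset at the start of each object, so that the same key can appear once per item in the array.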

+ /*
+ * We need at least two attribute numbers for a ndistinct item, anything
+ * less is malformed.
+ */
+ natts = parse->attnum_list->length;
Here, we can use list_length(), which returns 0 for NIL.

+ if (parse->attnum_list != NIL)
+ if (parse->distinct_items != NIL)
Here, we can also use list_length().

--
jian
https://www.enterprisedb.com/
