pgsql: Add input function for data type pg_ndistinct

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Add input function for data type pg_ndistinct
Date: 2025-11-26 01:16:15
Message-ID: E1vO48k-001QDn-1r@gemulon.postgresql.org
Lists: pgsql-committers

Add input function for data type pg_ndistinct

pg_ndistinct is used as the data type for the contents of ndistinct
extended statistics. This new input function consumes the format that
has been established by 1f927cce4498 for the output function of
pg_ndistinct, enforcing sanity checks on:
- The input object, which should be a one-dimensional array with correct
attributes and values.
- The key names: "attributes", "ndistinct". Both are required, other
key names are blocked.
- Value types for each key: "attributes" requires an array of integers,
and "ndistinct" an integer.
- The list of attributes. Each attribute list has to be a subset of the
longest attribute list found. This does not enforce that a full group of
attribute sets exists, given how the groups are generated when the
ndistinct objects are generated, which leaves the list of ndistinct
items a bit loose. Note that a check would still be required at import
to see if the attributes listed match the attribute numbers set in the
definition of a statistics object.
- The values. Based on the discussion, the checks on the values are
loose, as there is also an argument for potential stats injection. The
relation and attribute level stats follow the same line of argument for
the values.
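
The shape checks listed above can be sketched outside the backend as
follows. This is a minimal illustration in Python, not the C code added
by this commit: the function name validate_ndistinct, the error
messages, and the handling of an empty array are assumptions made for
the example, and it assumes the text format is a JSON array of objects
carrying the two keys named above.

```python
import json

REQUIRED_KEYS = {"attributes", "ndistinct"}


def is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_ndistinct(text):
    """Hypothetical checker: return the parsed items or raise ValueError."""
    items = json.loads(text)  # malformed JSON raises a ValueError subclass

    # The input object must be a one-dimensional array of objects.
    if not isinstance(items, list):
        raise ValueError("input must be an array")
    for item in items:
        if not isinstance(item, dict):
            raise ValueError("each array element must be an object")

        # Both keys are required, any other key name is rejected.
        if set(item) != REQUIRED_KEYS:
            raise ValueError("keys must be exactly 'attributes' and 'ndistinct'")

        # "attributes" is an array of integers, "ndistinct" an integer.
        attrs = item["attributes"]
        if not isinstance(attrs, list) or not all(is_int(a) for a in attrs):
            raise ValueError("'attributes' must be an array of integers")
        if not is_int(item["ndistinct"]):
            raise ValueError("'ndistinct' must be an integer")

    # Every attribute list must be a subset of the longest one found.
    # Assumption: an empty array passes here; the commit message does not
    # say how the real input function treats it.
    longest = set(max((i["attributes"] for i in items), key=len, default=[]))
    for item in items:
        if not set(item["attributes"]) <= longest:
            raise ValueError("attribute list is not a subset of the longest list")

    # The ndistinct values themselves are deliberately not range-checked,
    # mirroring the loose value checks described above.
    return items
```

With this sketch, an input such as
[{"attributes": [1, 2], "ndistinct": 4}, {"attributes": [1, 2, 3], "ndistinct": 6}]
passes, while an item with a missing key, an extra key, or an attribute
list that is not contained in the longest one is rejected. Matching the
attribute numbers against a statistics object definition is out of scope
here, as it is for the input function.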

This is required for a follow-up patch that aims to implement the import
of extended statistics. Some tests are added to check the code paths of
the JSON parser checking the shape of the pg_ndistinct inputs, with 90%
of code coverage reached. The tests are located in their own new test
file, for clarity.

Author: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Reviewed-by: Jian He <jian(dot)universality(at)gmail(dot)com>
Reviewed-by: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Reviewed-by: Michael Paquier <michael(at)paquier(dot)xyz>
Reviewed-by: Yuefei Shi <shiyuefei1004(at)gmail(dot)com>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/44eba8f06e5568be35fa3d112ab781e931fe04ae

Modified Files
--------------
src/backend/utils/adt/pg_ndistinct.c | 768 ++++++++++++++++++++++++++++-
src/test/regress/expected/pg_ndistinct.out | 447 +++++++++++++++++
src/test/regress/parallel_schedule | 2 +-
src/test/regress/sql/pg_ndistinct.sql | 106 ++++
src/tools/pgindent/typedefs.list | 2 +
5 files changed, 1316 insertions(+), 9 deletions(-)
