Re: pg_stats and range statistics

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Egor Rogov <e(dot)rogov(at)postgrespro(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_stats and range statistics
Date: 2023-11-24 23:06:24
Message-ID: CAPpHfduZ3hApO-GFT+Rtn75w3gR8qhcw+-xGENjHAANFMA5xKQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

On Wed, Sep 6, 2023 at 6:18 PM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
> + <literal>lower(ARRAY[numrange(1.1,2.2),numrange(3.3,4.4)])</literal>
> should be
> + <literal>ranges_lower(ARRAY[numrange(1.1,2.2),numrange(3.3,4.4)])</literal>
>
> + <literal>upper(ARRAY[numrange(1.1,2.2),numrange(3.3,4.4)])</literal>
> should be
> + <literal>ranges_upper(ARRAY[numrange(1.1,2.2),numrange(3.3,4.4)])</literal>
>
> https://www.postgresql.org/docs/current/catalog-pg-type.html
> there is no association between numrange and their base type numeric.
> so for template: anyarray ranges_lower(anyarray). I don't think we can
> input numrange array and return a numeric array.
>
> https://www.postgresql.org/docs/current/extend-type-system.html#EXTEND-TYPES-POLYMORPHIC
> >> When the return value of a function is declared as a polymorphic type, there must be at least one argument position that is also >> polymorphic, and the actual data type(s) supplied for the polymorphic arguments determine the actual result type for that call.
>
>
> regression=# select
> ranges_lower(ARRAY[numrange(1.1,2.2),numrange(3.3,4.4),
> numrange(5.5,6.6)]);
> ranges_lower
> ---------------
> {1.1,3.3,5.5}
> (1 row)
> regression=# \gdesc
> Column | Type
> --------------+------------
> ranges_lower | numrange[]
> (1 row)
>
> I don't think you can cast literal ' {1.1,3.3,5.5}' to numrange[].

Thank you for noticing this. Indeed, our polymorphic type system
doesn't support this case. In order to support this, we need
something like "anyrangearray" pseudo-type. However, it seems
overkill to introduce a new pseudo-type just to update pg_stats.

Additionally, I found that the current patch can't handle infinite
range bounds and discards information about inclusiveness of range
bounds. The infinite bounds could be represented as NULL (while I'm
not sure how good this representation is). Regarding inclusiveness, I
don't see the possibility to represent them in a reasonable way within
an array of base types. I also don't feel good about discarding the
accuracy in the pg_stats view.

In conclusion of all of the above, I decided to revise the patch and
show the bounds histogram as it's stored in pg_statistic. I revised
the docs correspondingly.

Also for some reason, the patch added description of new columns to
the documentation of pg_user_mapping table. I've fixed that by moving
them to the documentation of pg_stats view.

Also, I've extracted the new comment in pg_statistic.h into a separate patch.

I'm going to push this if there are no objections.

------
Regards,
Alexander Korotkov

Attachment Content-Type Size
0001-Add-comment-to-pg_statistic-catalog-table-20231125.patch application/octet-stream 1.3 KB
0002-Display-length-and-bounds-histograms-in-pg_-20231125.patch application/octet-stream 5.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2023-11-24 23:19:16 Re: Relation bulk write facility
Previous Message Bruce Momjian 2023-11-24 23:02:12 Re: remove deprecated @@@ operator ?