Re: pg_stats and range statistics

From: Egor Rogov <e(dot)rogov(at)postgrespro(dot)ru>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_stats and range statistics
Date: 2023-03-24 18:48:09
Message-ID: 80e37c96-bdb4-dd00-b2da-5a01366f685b@postgrespro.ru
Lists: pgsql-hackers

On 24.03.2023 01:46, Tomas Vondra wrote:

>
> So if you could clean it up a bit, and do something about the two open
> items I mentioned (a bunch of tests on different array,

I've added some tests to regress/sql/rangetypes.sql, based on the same
dataset that is used to test lower() and upper().

> and behavior
> consistent with lower/upper),

Done. This required switching from construct_array(), which doesn't
support NULLs, to construct_md_array(), which does. A nice side effect
is that we now also support multidimensional arrays.
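
As an illustration only, the element-wise behavior can be modeled outside
PostgreSQL. The sketch below is plain C, not the patch's code: the ToyRange
and NullableInt types and the toy_ranges_lower() name are hypothetical. It
follows the documented behavior of lower(), which returns NULL for an empty
range or a range with no lower bound. Mapping a NULL array element to a NULL
result is an assumption made here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an integer range; not PostgreSQL's RangeType. */
typedef struct
{
    bool is_null;        /* the array element itself is NULL */
    bool is_empty;       /* the range is 'empty' */
    bool lower_infinite; /* the range has no lower bound */
    int  lower;          /* lower bound, valid only if all flags are false */
} ToyRange;

/* Hypothetical nullable result slot: a value plus a NULL flag. */
typedef struct
{
    bool is_null;
    int  value;
} NullableInt;

/*
 * Fill out[i] with the lower bound of in[i], or mark it NULL where lower()
 * would return NULL (empty range or infinite lower bound). Treating a NULL
 * input element as a NULL output is an assumption of this sketch.
 */
void
toy_ranges_lower(const ToyRange *in, size_t n, NullableInt *out)
{
    assert(n == 0 || (in != NULL && out != NULL));

    for (size_t i = 0; i < n; i++)
    {
        bool no_bound = in[i].is_null || in[i].is_empty || in[i].lower_infinite;

        out[i].is_null = no_bound;
        out[i].value = no_bound ? 0 : in[i].lower;
    }
}
```

The result needs one NULL flag per element, which is why an array
constructor without NULL support is not enough for this behavior.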

I've moved the common part of ranges_lower_bounds() and
ranges_upper_bounds() into ranges_bounds_common(), following Justin's advice.

There is one thing I'm not sure what to do about. This check:

     if (typentry->typtype != TYPTYPE_RANGE)
         ereport(ERROR,
                 (errcode(ERRCODE_DATATYPE_MISMATCH),
                  errmsg("expected array of ranges")));

doesn't work, because the range_get_typcache() call errors out first
("type %u is not a range type"). That message doesn't look friendly
enough for a user-facing SQL function. Should we duplicate
range_get_typcache's logic and replace the error message?
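
To make the ordering problem concrete, here is a toy model in plain C. It
is not PostgreSQL code: toy_lookup_range(), toy_check_element_type(), and
the ToyTypeKind enum are hypothetical stand-ins, and errors are modeled as
return codes rather than ereport(ERROR), which does not return. It sketches
one possible option, testing the type kind before calling the lookup, so
the friendlier message is reached first.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical type kinds, loosely mirroring the typtype distinction. */
typedef enum { TOY_TYPTYPE_BASE, TOY_TYPTYPE_RANGE } ToyTypeKind;

typedef struct
{
    unsigned    oid;
    ToyTypeKind kind;
} ToyType;

/*
 * Stand-in for a lookup that raises its own internal-style error.
 * Returns 0 on success, -1 on error with the message written to errbuf.
 */
int
toy_lookup_range(const ToyType *t, char *errbuf, size_t errlen)
{
    assert(errlen > 0);

    if (t->kind != TOY_TYPTYPE_RANGE)
    {
        snprintf(errbuf, errlen, "type %u is not a range type", t->oid);
        return -1;
    }
    errbuf[0] = '\0';
    return 0;
}

/*
 * Check the type kind first, so the user-facing message wins, and only
 * then call the lookup.
 */
int
toy_check_element_type(const ToyType *t, char *errbuf, size_t errlen)
{
    assert(errlen > 0);

    if (t->kind != TOY_TYPTYPE_RANGE)
    {
        snprintf(errbuf, errlen, "expected array of ranges");
        return -1;
    }
    return toy_lookup_range(t, errbuf, errlen);
}
```

With a non-range type, calling toy_lookup_range() directly reports the
internal-style message, while toy_check_element_type() reports "expected
array of ranges". The cost is that the kind test is duplicated in the
caller.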

> that'd be great.
>
>> Do we stick with the ranges_upper(anyarray) and ranges_lower(anyarray)
>> functions? This approach is okay with me. Tomas, have you made up your
>> mind?
>>
> I think the function approach is fine, but in my January 22 message I
> was wondering why we're not actually naming them simply lower/upper.

I'd expect a lower(anyarray) function to return the lowest element in
the array. That name doesn't hint that the function takes an array of
ranges. So the ranges_ prefix seems justified to me.

>
>> Do we want to document these functions? They are very
>> pg_statistic-specific and won't be useful for end users imo.
>>
> I don't see why not to document them. Sure, we're using them in a fairly
> specific context, but I don't see why not to let people use them too
> (which would be hard without docs).

Okay. I've corrected the examples a bit.

The patch is attached.

Thanks,
Egor

Attachment Content-Type Size
pgstats_20230324.patch text/plain 15.0 KB
