Re: pg_stats and range statistics

From: Egor Rogov <e(dot)rogov(at)postgrespro(dot)ru>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_stats and range statistics
Date: 2023-01-23 10:01:46
Message-ID: d4d12fd0-b6e0-4cc7-649f-09690ff72f09@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 23.01.2023 02:21, Tomas Vondra wrote:
>
> On 1/22/23 22:33, Justin Pryzby wrote:
>> On Sun, Jan 22, 2023 at 07:19:41PM +0100, Tomas Vondra wrote:
>>> On 1/21/23 19:53, Egor Rogov wrote:
>>>> Hi Tomas,
>>>> On 21.01.2023 00:50, Tomas Vondra wrote:
>>>>> This simply adds two functions, accepting/producing anyarray - one for
>>>>> lower bounds, one for upper bounds. I don't think it can be done with a
>>>>> plain subquery (or at least I don't know how).
>>>> Anyarray is an alien to SQL, so functions are well justified here. What
>>>> makes me a bit uneasy is two almost identical functions. Should we
>>>> consider other options like a function with an additional parameter or a
>>>> function returning an array of bounds arrays (which is somewhat
>>>> wasteful, but probably it doesn't matter much here)?
>>>>
>>> I thought about that, but I think the alternatives (e.g. a single
>>> function with a parameter determining which boundary to return). But I
>>> don't think it's better.
>> What about a common function, maybe called like:
>>
>> ranges_upper_bounds(PG_FUNCTION_ARGS)
>> {
>> AnyArrayType *array = PG_GETARG_ANY_ARRAY_P(0);
>> Oid element_type = AARR_ELEMTYPE(array);
>> TypeCacheEntry *typentry;
>>
>> /* Get information about range type; note column might be a domain */
>> typentry = range_get_typcache(fcinfo, getBaseType(element_type));
>>
>> return ranges_bounds_common(typentry, array, false);
>> }
>>
>> That saves 40 LOC.
>>
> Thanks, that's better. But I'm still not sure it's a good idea to add
> function with anyarray argument, when we need it to be an array of
> ranges ...
>
> I wonder if we have other functions doing something similar, i.e.
> accepting a polymorphic type and then imposing additional restrictions
> on it.

I couldn't find such examples, but adding an adhoc polymorphic type just
doesn't look right for me. Besides, you'll end up adding not just
anyrangearray type, but also anymultirangearray,
anycompatiblerangearray, anycompatiblemultirangearray, and maybe their
"non"-counterparts like anynonrangearray, and all of these are not of
much use. And one day you may need an array of arrays or something...

I wonder if it's possible to teach SQL to work with anyarray type - at
runtime the actual type of anyarray elements is known, right? In fact,
unnest() alone is enough to eliminate the need of C functions altogether.

>> Shouldn't this add some sql tests ?
>>
> Yeah, I guess we should have a couple tests calling these functions on
> different range arrays.
>
> This reminds me lower()/upper() have some extra rules about handling
> empty ranges / infinite boundaries etc. These functions should behave
> consistently (as if we called lower() in a loop) and I'm pretty sure
> that's not the current state.

I can try to tidy things up, but first we need to decide on the general
approach.

>
>
> regards
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2023-01-23 10:02:21 Re: Polyphase merge is obsolete
Previous Message Nitin Jadhav 2023-01-23 09:59:16 Re: Improve GetConfigOptionValues function