Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "Isaac Morland" <isaac(dot)morland(at)gmail(dot)com>
Cc: "Mark Dilger" <mark(dot)dilger(at)enterprisedb(dot)com>, "Postgres hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Andreas Karlsson" <andreas(at)proxel(dot)se>, "David Fetter" <david(at)fetter(dot)org>
Subject: Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]
Date: 2021-03-02 05:52:07
Message-ID: 7a4675b5-bdf1-40b5-971c-c0f84142705c@www.fastmail.com
Lists: pgsql-hackers

On Tue, Mar 2, 2021, at 06:22, Isaac Morland wrote:
> On Tue, 2 Mar 2021 at 00:06, Joel Jacobson <joel(at)compiler(dot)org> wrote:
>> I find it strange two ranges of zero-length with different bounds are considered equal:
>>
>> SELECT '[7,7)'::int4range = '[8,8)'::int4range;
>> ?column?
>> ----------
>> t
>> (1 row)
>>
>> This seems like a bug to me. What am I missing here?
>>
>> Unless fixed, then the way I see it, I don't think we can use int4range[] for regexp_positions(),
>> if we want to allow returning the positions for zero-length matches, which would be nice.
>
> Ranges are treated as sets. As such equality is defined by membership.
>
> That being said, I agree that there may be situations in which it would be convenient to have empty ranges at specific locations. Doing this would introduce numerous questions which would have to be resolved. For example, where/when is the empty range resulting from an intersection operation?

Hmm, I think I would assume the intersection of two non-overlapping ranges to be isempty()=TRUE,
and for lower() and upper() to continue to return NULL.

But I think a zero-length range created with actual bounds should have
lower() and upper() return the bounds it was created with, instead of NULL.
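
To make the proposed semantics concrete, here is a minimal Python sketch. It is only a model of the idea, not how PostgreSQL implements range types; the class IntRange and its intersect() method are hypothetical names, and bounds of None stand in for the NULL that lower() and upper() return today.

```python
class IntRange:
    """Hypothetical half-open integer range [lower, upper); a model only."""

    def __init__(self, lower=None, upper=None):
        # Bounds of None model an empty range with no position,
        # e.g. the result of intersecting two non-overlapping ranges.
        self.lower = lower
        self.upper = upper

    def isempty(self):
        # Empty when there are no bounds, or the bounds enclose no integers.
        return self.lower is None or self.upper is None or self.lower >= self.upper

    def __eq__(self, other):
        if not isinstance(other, IntRange):
            return NotImplemented
        if self.isempty() or other.isempty():
            # Set semantics: every empty range has the same (zero) members.
            return self.isempty() and other.isempty()
        return (self.lower, self.upper) == (other.lower, other.upper)

    def __hash__(self):
        # Keep hashing consistent with equality: all empty ranges hash alike.
        return hash("empty") if self.isempty() else hash((self.lower, self.upper))

    def intersect(self, other):
        if self.isempty() or other.isempty():
            return IntRange()
        lo = max(self.lower, other.lower)
        hi = min(self.upper, other.upper)
        # A non-overlapping intersection yields an empty range without bounds.
        return IntRange(lo, hi) if lo < hi else IntRange()


a = IntRange(7, 7)                            # zero-length, created with bounds
b = IntRange(8, 8)                            # zero-length at another position
c = IntRange(1, 3).intersect(IntRange(5, 9))  # empty, no bounds

print(a == b, a.isempty(), a.lower, b.lower, c.lower)
```

In this model, equality and isempty() behave as they do today, while a zero-length range still remembers where it was created.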

I tried to find some other programming environments with range types.

The first one I found was Ada.

The below example is similar to int4range(7,6,'[]') which is invalid in PostgreSQL:

with Ada.Text_IO; use Ada.Text_IO;
procedure Hello is
   type Foo is range 7 .. 6;
begin
   Put_Line ( Foo'Image(Foo'First) );
   Put_Line ( Foo'Image(Foo'Last) );
end Hello;

$ ./gnatmake hello.adb
$ ./hello
7
6

In Ada, the 'Range of the Empty_String is 1 .. 0
https://en.wikibooks.org/wiki/Ada_Programming/Types/array#Array_Attributes

I think there is a case for allowing access to the lower/upper values instead of returning NULL,
since we can do so without changing what isempty() would return for the same values.

/Joel
