Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

From: Isaac Morland <isaac(dot)morland(at)gmail(dot)com>
To: Joel Jacobson <joel(at)compiler(dot)org>
Cc: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andreas Karlsson <andreas(at)proxel(dot)se>, David Fetter <david(at)fetter(dot)org>
Subject: Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]
Date: 2021-03-02 13:34:56
Message-ID: CAMsGm5drQ2ENpapVbcCn+8oDHjhx6qRohgMdLh9vL+Mcd0Zj-Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, 2 Mar 2021 at 00:52, Joel Jacobson <joel(at)compiler(dot)org> wrote:

> > Ranges are treated as sets. As such equality is defined by membership.
> >
> > That being said, I agree that there may be situations in which it would be
> > convenient to have empty ranges at specific locations. Doing this would
> > introduce numerous questions which would have to be resolved. For example,
> > where/when is the empty range resulting from an intersection operation?
>
>
> Hmm, I think I would assume the intersection of two non-overlapping ranges
> to be isempty()=TRUE,
> and for lower() and upper() to continue to return NULL.
>
> But I think a zero-length range created with actual bounds should
> return the lower() and upper() values during creation, instead of NULL.
>
> I tried to find some other programming environments with range types.
>
> The first one I found was Ada.
>

Interesting!

Array indices are a bit different from general ranges, however.

One question I would have is whether empty ranges are all equal to each
other. If they are, you have an equality that isn’t really equality; if
they aren’t, then you would have ranges that are unequal even though they
have exactly the same membership. Although I suppose the latter is already
true for some types where ends can be specified as open or closed but end
up with the same end element; many range types canonicalize to avoid this,
but I don’t think they all do.
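
To make the two notions of equality concrete, here is a minimal Python
sketch. It is only an illustration, not PostgreSQL's implementation: the
IntRange class and the set_equal and canonical helpers are hypothetical
names invented for this example, and the range is assumed to be a discrete
integer range stored in half-open [lower, upper) form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntRange:
    """Hypothetical integer range stored as [lower, upper)."""
    lower: int
    upper: int  # exclusive bound

    def members(self) -> frozenset:
        # The set of integers the range contains; empty when lower >= upper.
        return frozenset(range(self.lower, self.upper))


def set_equal(a: IntRange, b: IntRange) -> bool:
    # Equality by membership: any two empty ranges compare equal.
    return a.members() == b.members()


def canonical(lower: int, upper: int,
              lower_inc: bool = True, upper_inc: bool = False) -> IntRange:
    # Normalize open/closed bounds to the half-open [lower, upper) form,
    # so that e.g. [1,3] and [1,4) become the same stored value.
    lo = lower if lower_inc else lower + 1
    hi = upper + 1 if upper_inc else upper
    return IntRange(lo, hi)


empty_at_3 = IntRange(3, 3)
empty_at_7 = IntRange(7, 7)

# Same membership (both empty), yet different stored bounds.
same_membership = set_equal(empty_at_3, empty_at_7)
same_bounds = empty_at_3 == empty_at_7

# Two spellings of the same discrete range agree once canonicalized.
closed_form = canonical(1, 3, lower_inc=True, upper_inc=True)
half_open_form = canonical(1, 4)
```

Under membership equality the two empty ranges are indistinguishable; under
bounds equality they differ even though they contain the same elements. The
canonical helper shows how normalizing bounds removes the open/closed
ambiguity for a discrete type.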

Returning to the RE result issue, I wonder how much it actually matters
where any empty matches are. Certainly the actual contents of an empty
match don’t matter; you don’t need to be able to index into the string to
extract the substring. The only scenario I can see where it could matter is
if the RE is using lookahead or lookbehind to find occurrences before or
after something else. If we stipulate that the result array will be in
order, then you still don’t have the exact location of empty matches but
you do at least have where they are relative to non-empty matches.
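
A small Python sketch of that last point follows. It uses Python's re
module purely as a stand-in; PostgreSQL's regex engine is a different
implementation and may report matches differently. The helper names
match_spans and forget_empty_positions are hypothetical, and None is
assumed here to model an empty range that carries no bounds.

```python
import re


def match_spans(pattern: str, text: str) -> list:
    # (start, end) offsets of every match, in the order the engine finds them.
    return [m.span() for m in re.finditer(pattern, text)]


def forget_empty_positions(spans: list) -> list:
    # Model a range type whose empty value has no bounds: an empty match
    # becomes None, a non-empty match keeps its (start, end) offsets.
    return [None if start == end else (start, end) for start, end in spans]


# The lookahead alternative produces zero-length matches just before each "b".
spans = match_spans(r"a|(?=b)", "abab")
ordered = forget_empty_positions(spans)

for item in ordered:
    print("empty match" if item is None else f"match at {item}")
```

Even with the positions of the empty matches discarded, their place in the
ordered result still shows which non-empty matches they fall between.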
