Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace

From: Gilles Darold <gilles(at)darold(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Chapman Flack <chap(at)anastigmatix(dot)net>, er(at)xs4all(dot)nl, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace
Date: 2021-08-01 19:22:41
Message-ID: b0e7ffa2-e776-2b6b-d0b1-bf3289439790@darold.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Le 30/07/2021 à 23:38, Tom Lane a écrit :
> Gilles Darold <gilles(at)darold(dot)net> writes:
>> Le 26/07/2021 à 21:56, Tom Lane a écrit :
>>> I'm inclined to just drop the regexp_replace additions. I don't think
>>> that the extra parameters Oracle provides here are especially useful.
>>> They're definitely not useful enough to justify creating compatibility
>>> hazards for.
>> I would not say that being able to replace the Nth occurrence of a
>> pattern matching is not useful but i agree that this is not a common
>> case with replacement. Both Oracle [1] and IBM DB2 [2] propose this form
>> and I have though that we can not have compatibility issues because of
>> the different data type at the 4th parameter.
> Well, here's an example of the potential issues:
>
> [...]

Thanks for pointing me this case, I did not think that the prepared
statement could lead to this confusion.

>> Anyway, maybe we can just
>> rename the function even if I would prefer that regexp_replace() be
>> extended. For example:
>>     regexp_replace(source, pattern, replacement [, flags ]);
>>     regexp_substitute(source, pattern, replacement [, position ] [,
>> occurrence ] [, flags ]);
> Hmm. Of course the entire selling point of this patch seems to be
> bug-compatibility with Oracle, so using different names is largely
> defeating the point :-(
>
> Maybe we should just hold our noses and do it. The point that
> you'd get a recognizable failure if the wrong function were chosen
> reassures me a little bit. We've seen a lot of cases where this
> sort of ambiguity results in the system just silently doing something
> different from what you expected, and I was afraid that that could
> happen here.

I join a new version of the patch that include a check of the option
parameter in the basic form of regexp_replace() and return an error in
ambiguous cases.

PREPARE rr AS SELECT regexp_replace('healthy, wealthy, and
wise','(\w+)thy', '\1ish', $1);
EXECUTE rr(1);
ERROR:  ambiguous use of the option parameter in regex_replace(),
value: 1
HINT:  you might set the occurrence parameter to force the use of
the extended form of regex_replace()

This is done by checking if the option parameter value is an integer and
throw the error in this case. I don't think of anything better.

Best regards,

--
Gilles Darold

Attachment Content-Type Size
v5-0001-regexp-foo-functions.patch text/x-patch 73.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Gilles Darold 2021-08-01 19:48:00 Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace
Previous Message Tomas Vondra 2021-08-01 17:59:18 Re: slab allocator performance issues