Re: add function argument names to regex* functions.

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, Dian Fay <di(at)nmfay(dot)com>, Jim Nasby <jim(dot)nasby(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: add function argument names to regex* functions.
Date: 2024-04-04 13:54:53
Message-ID: CACJufxGpgsQgnRvcPLczTV1Ak=JS0LmHxWe4q4dv6Qd1tRkQ9A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 3, 2024 at 4:45 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> jian he <jian(dot)universality(at)gmail(dot)com> writes:
> > On Thu, Jan 18, 2024 at 4:17 PM Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
> >> Reading back through the discussion, I wasn't quite able to interpret
> >> the resolution regarding Oracle compatibility. From the patch, it looks
> >> like you chose not to adopt the parameter names from Oracle. Was that
> >> your intention?
>
> > per commit message:
> > https://git.postgresql.org/cgit/postgresql.git/commit/?id=6424337073589476303b10f6d7cc74f501b8d9d7
> > Even if the names are all the same, our function is still not the same
> > as oracle.
>
> The fact that there's minor discrepancies in the regex languages
> doesn't seem to me to have a lot of bearing on whether we should
> follow Oracle's choices of parameter names.
>
> However, if we do follow Oracle, it seems like we should do that
> consistently, which this patch doesn't. For instance, per [1]
> Oracle calls the arguments of regexp_substr
>
> source_char,
> pattern,
> position,
> occurrence,
> match_param,
> subexpr
>
> while we have
>
> string,
> pattern,
> start,
> N,
> flags,
> subexpr
>
> The patch proposes to replace "N" with "occurrence" but not touch
> the other discrepancies, which seems to me to be a pretty poor
> choice. "occurrence" is very long and difficult to spell correctly,
> and if you're not following Oracle slavishly, exactly what is the
> argument in its favor? I quite agree that Oracle's other choices
> aren't improvements over ours, but neither is that one.
>
> On the whole my inclination would be to stick to the names we have
> in the documentation. There might be an argument for changing "N"
> to something lower-case so you don't have to quote it; but if we do,
> I'd go for, say, "count".
>

The regexp_replace documentation currently has this paragraph:
---------------------------------------------------------------
The replacement string can contain \n, where n is 1 through 9, to
indicate that the source substring matching the n'th parenthesized
subexpression of the pattern should be inserted, and it can contain \&
to indicate that the substring matching the entire pattern should be
inserted.
---------------------------------------------------------------
That paragraph already uses a lower-case "n" for the subexpression
number, so changing "N" to lower-case "n" would be misleading for
regexp_replace. So I chose "count".

By the way, I find the paragraph above hard to comprehend.
The only related tests I can find in src/test/regress/sql/strings.sql are:
SELECT regexp_replace('1112223333', E'(\\d{3})(\\d{3})(\\d{4})',
E'(\\1) \\2-\\3');
SELECT regexp_replace('foobarrbazz', E'(.)\\1', E'X\\&Y', 'g');
SELECT regexp_replace('foobarrbazz', E'(.)\\1', E'X\\\\Y', 'g');

But these tests are not very readable.
Maybe we should have some simple examples to demonstrate the paragraph above.
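
As a rough sketch of what simpler examples could look like (my own
illustration, not taken from the patch or the existing tests; the results
assume standard_conforming_strings is on, so a backslash in an ordinary
string literal is a literal backslash):

```sql
-- \1 inserts the text matched by the first parenthesized subexpression.
SELECT regexp_replace('foobar', '(o+)', '[\1]');
-- f[oo]bar

-- \& inserts the text matched by the entire pattern.
SELECT regexp_replace('foobar', 'o+', '<\&>');
-- f<oo>bar

-- Several subexpressions can be referenced, in any order.
SELECT regexp_replace('John Smith', '(\w+) (\w+)', '\2, \1');
-- Smith, John

-- With the 'g' flag, the replacement is applied to every match.
SELECT regexp_replace('a1b22c', '(\d+)', '<\1>', 'g');
-- a<1>b<22>c
```

Each query isolates one feature of the replacement string, which might be
easier to follow than the E'...' tests with doubled backslashes.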

Attachment: v4-0001-add-regex-functions-argument-names.patch (text/x-patch, 19.4 KB)
