Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, adam(dot)warland(at)infor(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation
Date: 2025-12-02 17:18:11
Message-ID: 9f47c6b24d6cf59002603caf19b8b3f4854176be.camel@cybertec.at
Lists: pgsql-bugs

On Tue, 2025-12-02 at 18:36 +0200, Heikki Linnakangas wrote:
> +1, looks good to me. Let's also add a regression test for this.

Right, done in the attached.

> > I am not certain if it is safe to apply pg_mblen() to "haystack_end", though.
>
> It doesn't do that though, does it? There are two pg_mblen() calls in
> the vicinity:
>
> > for (const char *test_end = hptr; test_end <= haystack_end; test_end += pg_mblen(test_end))
> > {
> >     if (pg_strncoll(hptr, (test_end - hptr), needle, needle_len, state->locale) == 0)
> >     {
> >         state->last_match_len_tmp = (test_end - hptr);
> >         result_hptr = hptr;
> >         if (!state->greedy)
> >             break;
> >     }
> > }
> > if (result_hptr)
> >     break;
> >
> > hptr += pg_mblen(hptr);
>
> Neither of those will get called with 'haystack_end' as far as I can see.

During the last iteration of the loop, "test_end" will be equal to "haystack_end",
and the loop increment will call "pg_mblen(test_end)".
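
To illustrate the point, here is a minimal standalone sketch of the same loop shape.
It is not the PostgreSQL code: "toy_mblen" is a hypothetical stand-in for pg_mblen()
that only looks at the UTF-8 lead byte, and "count_iterations_at_end" is a made-up
helper. It assumes NUL-terminated, valid UTF-8 input, so that reading the byte at
"end" is harmless here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for pg_mblen(): byte length of a UTF-8 character,
 * derived from its first byte only.  Not PostgreSQL's implementation.
 */
static int
toy_mblen(const char *s)
{
	unsigned char c = (unsigned char) *s;

	if (c < 0x80)
		return 1;
	if ((c & 0xE0) == 0xC0)
		return 2;
	if ((c & 0xF0) == 0xE0)
		return 3;
	if ((c & 0xF8) == 0xF0)
		return 4;
	return 1;					/* invalid lead byte: advance one byte */
}

/*
 * Walk [start, end] with the same loop shape as the quoted code and count
 * the iterations whose body runs with test_end == end.  Since the body
 * never breaks, each such iteration is followed by toy_mblen(end) in the
 * loop increment.
 */
static int
count_iterations_at_end(const char *start, const char *end)
{
	int			at_end = 0;

	assert(start <= end);
	for (const char *test_end = start; test_end <= end;
		 test_end += toy_mblen(test_end))
	{
		if (test_end == end)
			at_end++;
	}
	return at_end;
}
```

With a "<=" bound the body runs once with test_end == end, so the increment
evaluates the length function on the byte at "end" before the condition fails.
Whether that is safe in the real code depends on what is stored at "haystack_end",
which is the question I raised above.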

Yours,
Laurenz Albe

Attachment Content-Type Size
v2-0001-Fix-greedy-substring-search-for-non-deterministic.patch text/x-patch 2.9 KB
