| From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
|---|---|
| To: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, adam(dot)warland(at)infor(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
| Subject: | Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation |
| Date: | 2025-12-02 16:36:06 |
| Message-ID: | 6935ea1e-cfb1-400b-9057-3d43b55d621a@iki.fi |
| Lists: | pgsql-bugs |
On 02/12/2025 18:24, Laurenz Albe wrote:
> On Tue, 2025-12-02 at 10:03 +0000, PG Bug reporting form wrote:
>> PostgreSQL version: 18.1
>>
>> When using a nondeterministic ICU collation, the replace() function fails to
>> replace a substring when that substring appears at the end of the input
>> string.
>>
>> Occurrences of the same substring earlier in the string are replaced
>> normally.
>>
>> Specific collation used:
>> create collation test_nondeterministic (
>> provider = icu,
>> locale = 'und-u-ks-level2',
>> deterministic = false
>> )
>>
>> -- Replace final character under nondeterministic collation
>> SELECT replace(
>> 'testx' COLLATE "test_nondeterministic",
>> 'x' COLLATE "test_nondeterministic",
>> 'y') AS res1;
>
> I can reproduce the problem, and the attached patch fixes it for me.

+1, looks good to me. Let's also add a regression test for this.

> I am not certain if it is safe to apply pg_mblen() to "haystack_end", though.

It doesn't do that though, does it? There are two pg_mblen() calls in
the vicinity:

> for (const char *test_end = hptr; test_end <= haystack_end; test_end += pg_mblen(test_end))
> {
>     if (pg_strncoll(hptr, (test_end - hptr), needle, needle_len, state->locale) == 0)
>     {
>         state->last_match_len_tmp = (test_end - hptr);
>         result_hptr = hptr;
>         if (!state->greedy)
>             break;
>     }
> }
> if (result_hptr)
>     break;
>
> hptr += pg_mblen(hptr);

Neither of those will get called with 'haystack_end' as far as I can see.

- Heikki
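
One way to check which pointers reach pg_mblen() in the quoted loop is to model it outside the server. The following is a minimal stand-alone sketch, not PostgreSQL code: `mblen_stub`, `strncoll_stub`, and `search_model` are hypothetical names, the stub treats every character as one byte, the comparison is a plain byte comparison rather than a collation-aware one, and the outer `while (hptr < haystack_end)` loop is an assumption about the code surrounding the quoted fragment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bookkeeping for the experiment: how often the stub saw haystack_end. */
static const char *watched_end;
static int	calls_at_end;

/* Hypothetical stand-in for pg_mblen(): single-byte characters only. */
static int
mblen_stub(const char *p)
{
	if (p == watched_end)
		calls_at_end++;
	return 1;
}

/* Hypothetical stand-in for pg_strncoll(): exact byte comparison. */
static int
strncoll_stub(const char *a, size_t alen, const char *b, size_t blen)
{
	if (alen != blen)
		return 1;
	return memcmp(a, b, alen) == 0 ? 0 : 1;
}

/*
 * Model of the quoted search loop (outer loop assumed).  Returns the start
 * of the first match, or NULL, and stores in *at_end how many times the
 * length stub was called with haystack_end.
 */
static const char *
search_model(const char *haystack, const char *needle, int greedy, int *at_end)
{
	const char *haystack_end = haystack + strlen(haystack);
	size_t		needle_len = strlen(needle);
	const char *hptr = haystack;
	const char *result_hptr = NULL;

	assert(needle_len > 0);
	watched_end = haystack_end;
	calls_at_end = 0;

	while (hptr < haystack_end)
	{
		/* The increment expression runs after every pass that does not break. */
		for (const char *test_end = hptr; test_end <= haystack_end; test_end += mblen_stub(test_end))
		{
			if (strncoll_stub(hptr, (size_t) (test_end - hptr), needle, needle_len) == 0)
			{
				result_hptr = hptr;
				if (!greedy)
					break;
			}
		}
		if (result_hptr)
			break;

		hptr += mblen_stub(hptr);
	}

	*at_end = calls_at_end;
	return result_hptr;
}
```

In this model, `search_model("testx", "x", 0, &n)` returns a pointer to the final `x` and leaves `n` at 4: each of the four non-matching start positions lets the inner loop reach `test_end == haystack_end` without a `break`, so the `for` increment evaluates the stub at that position once per pass. Whether that matters for the real pg_mblen() depends on what it is allowed to read there, which this sketch does not model.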