Re: Speed up ICU case conversion by using ucasemap_utf8To*()

From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org, Alexander Lakhin <exclusion(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, zengman <zengman(at)halodbtech(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up ICU case conversion by using ucasemap_utf8To*()
Date: 2026-04-13 06:35:57
Message-ID: AF5F542C-3703-45C4-B590-F02D00BFC809@proxel.se
Lists: pgsql-hackers

On 1 April 2026 02:46:23 CEST, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
>My proposed fix is that we allocate a ULOC_LANG_CAPACITY buffer for the language, like we do in fix_icu_locale_str(), instead of trying to be clever. An alternative would be to use strncmp("tr", lang, 3), but that seems too clever for my taste in something which is not performance critical. A third option would be to check for U_STRING_NOT_TERMINATED_WARNING, but I think that would just make the code unnecessarily convoluted.
>
>I have attached my proposed fix.
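
For readers without the attachment, here is a minimal sketch of the idea behind the first option. It is not the attached patch. To stay self-contained it does not call ICU: get_language() is a hypothetical stand-in for uloc_getLanguage(), LANG_CAPACITY stands in for ULOC_LANG_CAPACITY, and is_turkish() is an illustrative name. The stand-in assumes the ICU convention that the output is NUL-terminated only when the result is shorter than the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for ICU's ULOC_LANG_CAPACITY; the value here is an assumption. */
#define LANG_CAPACITY 12

/*
 * Hypothetical stand-in for uloc_getLanguage(): copies the language subtag
 * (everything before the first '_', '-' or '@') into buf and returns its
 * full length. The buffer is NUL-terminated only when the subtag is shorter
 * than the capacity, which is the case a too-small buffer trips over.
 */
static size_t
get_language(const char *locale, char *buf, size_t capacity)
{
    size_t len = strcspn(locale, "_-@");

    assert(capacity > 0);
    memcpy(buf, locale, len < capacity ? len : capacity);
    if (len < capacity)
        buf[len] = '\0';
    return len;
}

/*
 * With a generously sized buffer, any real language code fits together with
 * its terminator, so a plain strcmp() is safe and no special handling of
 * three-letter codes is needed.
 */
static bool
is_turkish(const char *locale)
{
    char   lang[LANG_CAPACITY];
    size_t len = get_language(locale, lang, sizeof(lang));

    if (len >= sizeof(lang))
        return false;           /* unterminated or truncated: cannot be "tr" */
    return strcmp(lang, "tr") == 0;
}
```

With a three-byte buffer, a three-letter language code would fill the buffer without a terminator, and strcmp() would read past its end. The larger buffer avoids that without resorting to strncmp() or inspecting the warning status.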

Since it is likely that I introduced, or at least exposed, this bug, I am adding it to the open items for PG 19.

Andreas
