pgsql: Fix INITCAP() word boundaries for PG_UNICODE_FAST.

From: Jeff Davis <jdavis(at)postgresql(dot)org>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Fix INITCAP() word boundaries for PG_UNICODE_FAST.
Date: 2025-04-21 19:35:39
Message-ID: E1u6wvb-0019RV-0H@gemulon.postgresql.org
Lists: pgsql-committers

Fix INITCAP() word boundaries for PG_UNICODE_FAST.

Word boundaries are based on whether a character is alphanumeric or
not. For the PG_UNICODE_FAST collation, alphanumeric includes
non-ASCII digits; whereas for the PG_C_UTF8 collation, it only
includes digits 0-9. Pass down the right information from the
pg_locale_t into initcap_wbnext to differentiate the behavior.
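The distinction above can be illustrated with a minimal sketch. This is not the PostgreSQL source: `sketch_isalnum` and `sketch_count_words` are hypothetical names, and the Arabic-Indic range U+0660..U+0669 stands in for the full set of non-ASCII digits. The `posix` flag plays the role of the information passed down from the locale: when true, only 0-9 count as digits (the PG_C_UTF8 behavior described above); when false, non-ASCII digits count as alphanumeric too (the PG_UNICODE_FAST behavior).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy alphanumeric test (hypothetical, for illustration only).
 * Recognizes ASCII letters and ASCII digits; when posix is false it
 * also accepts Arabic-Indic digits U+0660..U+0669 as a stand-in for
 * all non-ASCII digits.
 */
static bool
sketch_isalnum(uint32_t cp, bool posix)
{
    if ((cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z'))
        return true;
    if (cp >= '0' && cp <= '9')
        return true;
    if (!posix && cp >= 0x0660 && cp <= 0x0669)
        return true;
    return false;
}

/*
 * Count words in an array of code points. A word starts wherever an
 * alphanumeric code point follows the start of the input or a
 * non-alphanumeric code point.
 */
static size_t
sketch_count_words(const uint32_t *cps, size_t len, bool posix)
{
    size_t      words = 0;
    bool        prev_alnum = false;

    for (size_t i = 0; i < len; i++)
    {
        bool        cur = sketch_isalnum(cps[i], posix);

        if (cur && !prev_alnum)
            words++;
        prev_alnum = cur;
    }
    return words;
}
```

In this sketch, the three code points `a`, U+0661, `b` form two words when `posix` is true, because the non-ASCII digit acts as a separator, and a single word when `posix` is false. A title-casing routine built on this boundary test would therefore capitalize the `b` only in the first case.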

Reported-by: Noah Misch <noah(at)leadboat(dot)com>
Reviewed-by: Noah Misch <noah(at)leadboat(dot)com>
Discussion: https://postgr.es/m/20250417135841.33.nmisch@google.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/90260e2ec6bbfc3dfa9d9501ab75c535de52f677

Modified Files
--------------
src/backend/utils/adt/pg_locale_builtin.c | 4 +++-
src/common/unicode/case_test.c | 13 ++++++++++++-
src/test/regress/expected/collate.utf8.out | 8 ++++++--
src/test/regress/sql/collate.utf8.sql | 2 ++
4 files changed, 23 insertions(+), 4 deletions(-)
