Re: Remaining dependency on setlocale()

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Remaining dependency on setlocale()
Date: 2025-06-10 15:32:01
Message-ID: 9875f7f9-50f1-4b5d-86fc-ee8b03e8c162@eisentraut.org
Lists: pgsql-hackers

On 07.06.25 00:23, Jeff Davis wrote:
> On Thu, 2025-06-05 at 22:15 -0700, Jeff Davis wrote:
>> To continue this thread, I did a symbol search in the meson build
>> directory like (patterns.txt attached):
>
> Attached a rough patch series which does what everyone seemed to agree
> on:
>
> * Change some trivial ASCII cases to use pg_ascii_* variants
> * Set LC_COLLATE and LC_CTYPE to C with pg_perm_setlocale
> * Introduce a new global_lc_ctype for callers that still need to use
> operations that depend on datctype

v1-0001-copyfromparse.c-use-pg_ascii_tolower-rather-than-.patch
v1-0002-contrib-spi-refint.c-use-pg_ascii_tolower-instead.patch
v1-0003-isn.c-use-pg_ascii_toupper-instead-of-toupper.patch
v1-0004-inet_net_pton.c-use-pg_ascii_tolower-rather-than-.patch

These look good to me.

v1-0005-Add-global_lc_ctype-to-hold-locale_t-for-datctype.patch

This looks ok (but might depend on how patch 0006 turns out).

v1-0006-Use-global_lc_ctype-for-callers-of-locale-aware-f.patch

I think these need further individual analysis and an explanation of why
they should use the global lc_ctype setting. For example, you could argue
that the SQL-callable soundex(text) function should use the collation
object of its input value, not the global locale. But furthermore,
soundex_code() could actually just use pg_ascii_toupper() instead. And
in ts_locale.c, the isalnum_l() call should use the mylocale that already
exists in that function. The problem to solve is getting a good value
into mylocale. Using the global setting confuses the issue a bit, I think.
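
To illustrate the soundex_code() point, here is a minimal standalone sketch, not the actual fuzzystrmatch code. The names sketch_ascii_toupper(), sketch_soundex_table and sketch_soundex_code() are made up for this example; the first is assumed to behave like pg_ascii_toupper() (fold a-z only, no locale lookup), and the table is the standard Soundex digit mapping for A..Z.

```c
/*
 * Stand-in for pg_ascii_toupper(): folds only a-z and never consults
 * the process locale.  (Hypothetical name, for illustration only.)
 */
unsigned char
sketch_ascii_toupper(unsigned char ch)
{
    if (ch >= 'a' && ch <= 'z')
        ch += 'A' - 'a';
    return ch;
}

/* Standard Soundex digit for each of the letters A..Z. */
const char sketch_soundex_table[] = "01230120022455012623010202";

/*
 * Map a letter to its Soundex digit.  Anything outside A-Z/a-z,
 * including non-ASCII bytes, is returned unchanged.
 */
char
sketch_soundex_code(char letter)
{
    unsigned char up = sketch_ascii_toupper((unsigned char) letter);

    if (up >= 'A' && up <= 'Z')
        return sketch_soundex_table[up - 'A'];
    return letter;
}
```

Since only A-Z ever index the table, a locale-aware toupper() buys nothing here, and the ASCII-only fold gives the same answer under any LC_CTYPE.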

v1-0007-Fix-the-last-remaining-callers-relying-on-setloca.patch

Do we have any data on which platforms we'd need these checks for?

Also, if you look at what p_isxdigit is used for in wparser_def.c, it's
used for parsing XML (presumably HTML) files, so we just need ASCII-only
behavior and no locale dependency.
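
As a rough sketch of what "ASCII-only" would mean for that check: the helper below is hypothetical (sketch_ascii_isxdigit() is not an existing function), and it assumes a single-byte input, whereas the real parser may also deal with wide characters.

```c
#include <stdbool.h>

/*
 * Locale-independent hex-digit test: accepts exactly 0-9, a-f and A-F.
 * Bytes above 0x7F are never hex digits, whatever LC_CTYPE says.
 */
bool
sketch_ascii_isxdigit(unsigned char ch)
{
    return (ch >= '0' && ch <= '9') ||
           (ch >= 'a' && ch <= 'f') ||
           (ch >= 'A' && ch <= 'F');
}
```

A check like this needs neither setlocale() nor a locale_t, so it would not need any platform-specific handling.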

v1-0008-Set-process-LC_COLLATE-C-and-LC_CTYPE-C.patch

As I mentioned earlier in the thread, I don't think we can do this for
LC_CTYPE, because otherwise system error messages would not come out in
the right encoding. For the LC_COLLATE setting, I think we could just
do the setting in main(), where the other non-database-specific locale
categories are set.
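
Roughly, the split would look like the sketch below. This is a standalone illustration, not the code in main(): sketch_init_process_locale() is a made-up name, and it assumes LC_CTYPE is simply taken from the environment.

```c
#include <locale.h>

/*
 * Pin LC_COLLATE to "C" for the whole process, but let LC_CTYPE keep
 * following the environment so that system error messages can still be
 * produced in the right encoding.  Returns 0 on success, -1 on failure.
 */
int
sketch_init_process_locale(void)
{
    /*
     * If the environment names a locale that is not installed, this
     * call fails and LC_CTYPE just stays at its current value.
     */
    (void) setlocale(LC_CTYPE, "");

    /* The "C" locale always exists, so this should not fail. */
    if (setlocale(LC_COLLATE, "C") == NULL)
        return -1;
    return 0;
}
```

After this, strcoll() in the process compares bytewise like strcmp(), while the ctype functions still follow whatever LC_CTYPE ended up being.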
