From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | Peter Eisentraut <peter(at)eisentraut(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Remaining dependency on setlocale() |
Date: | 2025-06-11 19:15:14 |
Message-ID: | a8666c391dfcabe79868d95f7160eac533ace718.camel@j-davis.com |
Lists: | pgsql-hackers |
On Tue, 2025-06-10 at 17:32 +0200, Peter Eisentraut wrote:
> v1-0001-copyfromparse.c-use-pg_ascii_tolower-rather-than-.patch
> v1-0002-contrib-spi-refint.c-use-pg_ascii_tolower-instead.patch
> v1-0003-isn.c-use-pg_ascii_toupper-instead-of-toupper.patch
> v1-0004-inet_net_pton.c-use-pg_ascii_tolower-rather-than-.patch
>
> These look good to me.
Committed. (That means they're in 18, which was not my intention, but
others seemed to think it was harmless enough, so I didn't revert. I
will wait for the branch before I commit any more of these.)
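For context on why these four replacements are behavior-neutral for ASCII input: tolower() and toupper() consult the process-wide LC_CTYPE setting, while an ASCII-only helper touches only the bytes 'A' through 'Z'. The following is a minimal sketch of that idea, not the committed code; the sketch_ names are hypothetical.

```c
/*
 * Sketch only: ASCII-only case folding that never looks at the process
 * locale.  The sketch_ names are hypothetical, chosen for illustration.
 */
static unsigned char
sketch_ascii_tolower(unsigned char ch)
{
    /* Only the 26 ASCII uppercase letters are changed. */
    if (ch >= 'A' && ch <= 'Z')
        ch += 'a' - 'A';
    return ch;
}

/* Fold a NUL-terminated string in place; high-bit bytes are left alone. */
static void
sketch_ascii_downcase(char *s)
{
    for (; *s; s++)
        *s = (char) sketch_ascii_tolower((unsigned char) *s);
}
```

Because bytes outside 'A'..'Z' pass through unchanged, the result does not depend on whatever setlocale() was last called with.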
> v1-0005-Add-global_lc_ctype-to-hold-locale_t-for-datctype.patch
>
> This looks ok (but might depend on how patch 0006 turns out).
I changed this to a global_libc_locale that includes both LC_COLLATE
and LC_CTYPE (from datcollate and datctype), in case an extension is
relying on strcoll for some reason.
> v1-0006-Use-global_lc_ctype-for-callers-of-locale-aware-f.patch
>
> I think these need further individual analysis and explanation why these
> should use the global lc_ctype setting.
This patch series, at least so far, is designed to have zero behavior
changes. Anything with a potential for a behavior change should be a
separate commit, so that if we need to revert it, we can revert the
behavior change without reintroducing a setlocale() dependency.
> For example, you could argue
> that the SQL-callable soundex(text) function should use the collation
> object of its input value, not the global locale.
That would be a behavior change.
> But furthermore,
> soundex_code() could actually just use pg_ascii_toupper() instead.
Is that a behavior change?
> And in ts_locale.c, the isalnum_l() call should use mylocale that already
> exists in that function. The problem to solve is getting a good value
> into mylocale. Using the global setting confuses the issue a bit, I
> think.
I reworked it to be less confusing by changing wchar2char/char2wchar to
take a locale_t instead of pg_locale_t. Hopefully it's an improvement.
In get_iso_localename(), there's a comment saying that it doesn't
matter which locale is used (because it's ASCII), but to use the "_l"
variants, we need to pick some locale. At that point it's not clear to
me that global_libc_locale will be set yet, so I used LC_C_LOCALE.
I'm not sure whether we can rely on LC_C_LOCALE being available, but it
passed in CI, and if it's not available somewhere it might be a good
idea to create it on those platforms anyway.
> v1-0007-Fix-the-last-remaining-callers-relying-on-setloca.patch
>
> Do we have any data what platforms we'd need these checks for?
https://cirrus-ci.com/build/5167600088383488
Looks like Windows doesn't have iswxdigit_l or isxdigit_l.
> Also, if you look into wparser_def.c what p_isxdigit is used for, it's
> used for parsing XML (presumably HTML) files, so we just need ASCII-only
> behavior and no locale dependency.
iswxdigit() does seem to be dependent on locale, so this could be a
subtle behavior change.
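For comparison, an ASCII-only hex-digit test on wide characters could look like the sketch below (the function name is hypothetical). Any non-ASCII character that some locale's iswxdigit() accepts would be rejected here, which is the kind of subtle difference in question.

```c
#include <wchar.h>

/*
 * Sketch only: locale-independent hex-digit test.  Accepts exactly
 * 0-9, a-f and A-F; everything else, including any non-ASCII code
 * point, is rejected.
 */
static int
sketch_ascii_iswxdigit(wint_t wc)
{
    return (wc >= L'0' && wc <= L'9') ||
           (wc >= L'a' && wc <= L'f') ||
           (wc >= L'A' && wc <= L'F');
}
```

Such a function needs neither a locale_t nor a platform-specific "_l" variant, so it would also sidestep the missing functions on Windows.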
> v1-0008-Set-process-LC_COLLATE-C-and-LC_CTYPE-C.patch
>
> As I mentioned earlier in the thread, I don't think we can do this for
> LC_CTYPE, because otherwise system error messages would not come out in
> the right encoding.
Changed it so that it only sets LC_COLLATE to C, and leaves LC_CTYPE
set to datctype.
Unfortunately, as long as LC_CTYPE is set to a real locale, there's a
danger of accidentally depending on that setting. Can the encoding be
controlled with LC_MESSAGES instead of LC_CTYPE?
Do you have an example of how things can go wrong?
> For the LC_COLLATE settings, I think we could just
> do the setting in main(), where the other non-database-specific locale
> categories are set.
Done.
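In outline, forcing the category early in startup amounts to a single setlocale() call that leaves LC_CTYPE alone. The helper below is a simplified sketch with a hypothetical name, not the code in v2-0007.

```c
#include <locale.h>
#include <stddef.h>

/*
 * Sketch only: pin the process-wide LC_COLLATE to "C" and leave every
 * other category (notably LC_CTYPE) as it was.  Returns 0 on success.
 */
static int
sketch_force_collate_c(void)
{
    if (setlocale(LC_COLLATE, "C") == NULL)
        return -1;
    return 0;
}
```

After this call, plain strcoll() compares strings the same way strcmp() does, so code that accidentally relies on the process collation gets a fixed, database-independent answer.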
Regards,
Jeff Davis
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Hold-datcollate-datctype-in-global_libc_locale.patch | text/x-patch | 5.2 KB |
v2-0002-fuzzystrmatch-use-global_libc_locale.patch | text/x-patch | 3.6 KB |
v2-0003-ltree-use-global_libc_locale.patch | text/x-patch | 666 bytes |
v2-0004-Use-global_libc_locale-for-downcase_identifier-an.patch | text/x-patch | 2.8 KB |
v2-0005-Change-wchar2char-and-char2wchar-to-accept-a-loca.patch | text/x-patch | 7.4 KB |
v2-0006-tsearch-use-global_libc_locale.patch | text/x-patch | 5.5 KB |
v2-0007-Force-LC_COLLATE-to-C-in-postmaster.patch | text/x-patch | 3.1 KB |