Re: Remaining dependency on setlocale()

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Remaining dependency on setlocale()
Date: 2025-12-05 15:01:35
Message-ID: 38f3f4a6-e95e-47a7-914f-110dd6191301@eisentraut.org
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 29.11.25 21:50, Jeff Davis wrote:
> All fixed, thank you! (I apologize for posting a patch in that state to
> begin with...)
>
> I also reorganized slightly to separate out the pg_iswcased() API into
> its own patch, and moved the like_support.c changes from the ctype_is_c
> patch (already committed: 1476028225) into the pattern prefixes patch.

I reviewed the v11 patches. But I wasn't able to apply them locally
(couldn't find a starting commit where they applied cleanly), so I
haven't tested them.

Patches 0001 through 0006 seem generally ok, with some small comments:

v11-0003-Fix-inconsistency-between-ltree_strncasecmp-and-.patch

The function comment reads "Check if b has a prefix of a." -- Is that
the same as "Check if a is a prefix of b."? The latter might be clearer.

v11-0004-Remove-char_tolower-API.patch

The updated comment reads

+ * For efficiency reasons, in the C locale we don't call lower()
on the
+ * pattern and text, but instead call SB_lower_char on each
character.

but the patch removes SB_lower_char().

v11-0006-Use-multibyte-aware-extraction-of-pattern-prefix.patch

Might have a small typo in the commit message:

; and preserve and char-at-a-time logic for bytea.

For the remaining patches I have some more substantial questions.

v11-0007-fuzzystrmatch-use-pg_ascii_toupper.patch

dmetaphone.c has a comment

case '\xc7': /* C with cedilla */

so the premise that "fuzzystrmatch is designed for ASCII" does not
appear to be correct. Needs more analysis.

(But apparently it's not multibyte aware at all, so I don't know what to
do about that.)

v11-0008-downcase_identifier-use-method-table-from-locale.patch

I'm confused here about the name of the function pg_strfold_ident(). In
general, case "folding" results in an opaque string that is really only
useful for comparing against other case-folded strings. But for
identifiers we are actually interested lower-casing. I think this
should be corrected in the API naming.

v11-0009-Control-LC_COLLATE-with-GUC.patch

I know there were some complaints about compatibility with extensions,
but I don't think anything concrete was presented. I would like to see
more evidence that we need this.

Also, recall that we used to have a lc_collate GUC, and in the end
people got confused that it didn't actually show a meaningful value when
you used ICU. So we removed that. It seems adding this back in would
create a similar kind of confusion. So to avoid that, maybe this should
be called fallback_lc_collate or something like that.

If we were to proceed with this patch, it should have some documentation
and tests.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Álvaro Herrera 2025-12-05 15:20:57 Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY
Previous Message Jim Vanns 2025-12-05 14:59:07 [PATCH] Add support for SAOP in the optimizer for partial index paths